Field Of Invention
The present invention is in the field of
hepatitis virology. The invention relates to the complete
nucleotide and deduced amino acid sequences of the envelope
1 (E1) gene of 51 hepatitis C virus (HCV) isolates from
around the world and the grouping of these isolates into
twelve distinct HCV genotypes. More specifically, this
invention relates to oligonucleotides, peptides and
recombinant proteins derived from the envelope 1 gene
sequences of the 51 isolates of hepatitis C virus and to
diagnostic methods and vaccines which employ these
reagents.

Background Of Invention 
Hepatitis C, originally called non-A, non-B
hepatitis, was first described in 1975 as a disease
serologically distinct from hepatitis A and hepatitis B
(Feinstone, S.M. et al. (1975) N. Engl. J. Med. 292:767-770).
Although hepatitis C was (and is) the leading type
of transfusion-associated hepatitis as well as an important
part of community-acquired hepatitis, little progress was
made in understanding the disease until the recent
identification of hepatitis C virus (HCV) as the causative
agent of hepatitis C via the cloning and sequencing of the
HCV genome (Choo, Q.-L. et al. (1989) Science 244:359-362).
The sequence information generated by this study resulted
in the characterization of HCV as a small, enveloped,
positive-stranded RNA virus and led to the demonstration
that HCV is a major cause of both acute and chronic
hepatitis worldwide (Weiner, A.J. et al. (1990) Lancet
335:1-3). These observations, combined with studies
showing that over 50% of acute cases of hepatitis C
progress to chronicity with 20% of these resulting in
cirrhosis and an undetermined proportion progressing to
liver cancer, have led to tremendous efforts by
investigators within the hepatitis C field to develop
diagnostic assays and vaccines which can detect and prevent
hepatitis C infection.

The cloning and sequencing of the HCV genome by
Choo et al. (1989) has permitted the development of
serologic tests which can detect HCV or antibody to HCV
(Kuo, G. et al. (1989) Science 244:362-364). In addition,
the work of Choo et al. has also allowed the development of
methods for detecting HCV infection via amplification of
HCV RNA sequences by reverse transcription and cDNA
polymerase chain reaction (RT-PCR) using primers derived
from the HCV genomic sequence (Weiner, A.J. et al.).
However, although the development of these diagnostic
methods has resulted in improved diagnosis of HCV
infection, only approximately 60% of cases of hepatitis C
are associated with a factor identified as contributing to
transmission of HCV (Alter, M.J. et al. (1989) JAMA
262:1201-1205). This observation suggests that effective
control of hepatitis C transmission is likely to occur only
via universal pediatric vaccination as has been initiated
recently for hepatitis B virus. Unfortunately, attempts to
date to protect chimpanzees from hepatitis C infection via
administration of recombinant vaccines have had only
limited success. Moreover, the apparent genetic
heterogeneity of HCV, as indicated by the recent assignment
of all available HCV isolates to one of four genotypes, I-IV
(Okamoto, H. et al. (1992) J. Gen. Virol. 73:673-679),
presents additional hurdles which must be overcome in order
to develop accurate and effective diagnostic assays and
vaccines.

For example, one possible obstacle to the
development of effective hepatitis C vaccines would arise
if the observed genetic heterogeneity of HCV reflects
serologic heterogeneity. In such a case, the most
genetically diverse strains of HCV may then represent
different serotypes of HCV with the result being that
infection with one strain may not protect against infection
with another. Indeed, the inability of one strain to
protect against infection with another strain was recently
noted by both Farci et al. (Farci, P. et al. (1992) Science
258:135-140) and Prince et al. (Prince, A.M. et al. (1992)
J. Infect. Dis. 165:438-443), each of whom presented
evidence that while infection with one strain of HCV does
modify the degree of the hepatitis C associated with the
reinfection, it does not protect against reinfection with a
closely related strain. The genetic heterogeneity among
different HCV strains also increases the difficulty
encountered in developing RT-PCR assays to detect HCV
infection since such heterogeneity often results in
false-negative results because of primer and template mismatch.
In addition, currently used serologic tests for detection
of HCV or for detection of antibody to HCV are not
sufficiently well developed to detect all of the HCV
genotypes which might exist in a given blood sample.
Finally, in terms of choosing the proper treatment modality
to combat hepatitis infection, the inability of presently
available serologic assays to distinguish among the various
genotypes of HCV represents a significant shortcoming in
that recent reports suggest that an HCV-infected patient's
response to therapy might be related to the genotype of the
infectious virus (Yoshioka, K. et al. (1992) Hepatology
16:293-299; Kanai, K. et al. (1992) Lancet 339:1543; Lau,
J.Y.N. et al. (1992) Hepatology 16:209A). Indeed, the data
presented in the above studies suggest that the closely
related genotypes I and II are less responsive to
interferon therapy than are the closely related genotypes
III and IV. Moreover, preliminary data by Pozzato et al.
(Pozzato, G. et al. (1991) Lancet 338:509) suggest that
different genotypes may be associated with different types
or degrees of clinical disease. Taken together, these
studies suggest that before effective vaccines against HCV
infection can be developed, and indeed, before more
accurate and effective methods for diagnosis and treatment
of HCV infection can be produced, one must obtain a greater
knowledge about the genetic and serologic diversity of HCV
isolates.

In a recent attempt to gain an understanding of
the extent of genetic heterogeneity among HCV strains, Bukh
et al. carried out a detailed analysis of HCV isolates via
the use of PCR technology to amplify different regions of
the HCV genome (Bukh, J. et al. (1992a) Proc. Natl. Acad.
Sci. U.S.A. 89:187-191). Following PCR amplification, the
5'-noncoding (5' NC) portion of the genomes of various HCV
isolates was sequenced and it was found that primer pairs
designed from conserved regions of the 5' NC region of the
HCV genome were more sensitive for detecting the presence
of HCV than were primer pairs representing other portions
of the genome (Bukh, J. et al. (1992b) Proc. Natl. Acad.
Sci. U.S.A. 89:4942-4946). In addition, the authors noted
that although many of the HCV isolates examined could be
classified into the four genotypes described by Okamoto et
al. (1992), other previously undescribed genotypes emerged
based on genetic heterogeneity observed in the 5' NC region
of the various isolates. One of the most prominent of
these newly noted genotypes comprised a group of related
viruses that contained the most genetically divergent 5' NC
regions of those studied. This group of viruses,
tentatively classified as a fifth genotype, is very
similar to strains recently described by others (Cha, T.-A.
et al. (1992) Proc. Natl. Acad. Sci. U.S.A. 89:7144-7148;
Chan, S-W. et al. (1992) J. Gen. Virol. 73:1131-1141 and
Lee, C-H. et al. (1992) J. Clin. Microbiol. 30:1602-1604).
In addition, at least four more putative genotypes were
identified, thereby providing evidence that the genetic
heterogeneity of HCV was more extensive than previously
appreciated.

However, while the studies of Bukh et al. (1992a
and b) provided new and useful information on the genetic
heterogeneity of HCV, it is widely appreciated by those
skilled in the art that the three structural genes of HCV,
core (C), envelope (E1) and envelope 2/nonstructural 1
(E2/NS1), are the most important for the development of
serologic diagnostics and vaccines since it is the product
of these genes that constitutes the hepatitis C virion.
Thus, a determination of the nucleotide sequence of one or
all of the structural genes of a variety of HCV isolates
would be useful in designing reagents for use in diagnostic
assays and vaccines since a demonstration of genetic
heterogeneity in a structural gene(s) of HCV isolates might
suggest that some of the HCV genotypes represent distinct
serotypes of HCV based upon the previously observed
relationship between genetic heterogeneity and serologic
heterogeneity among another group of single-stranded,
positive-sense RNA viruses, the picornaviruses (Rueckert,
R.R. "Picornaviridae and their replication", in Fields,
B.N. et al., eds. Virology, New York: Raven Press, Ltd.
(1990) 507-548).

Summary Of Invention

The present invention relates to 51 cDNAs, each
encoding the complete nucleotide sequence of the envelope 1
(E1) gene of an isolate of human hepatitis C virus (HCV).

The present invention also relates to the nucleic
acid and deduced amino acid sequences of these E1 cDNAs.

It is an object of this invention to provide
synthetic nucleic acid sequences capable of directing
production of recombinant E1 proteins, as well as
equivalent natural nucleic acid sequences. Such natural
nucleic acid sequences may be isolated from a cDNA or
genomic library from which the gene capable of directing
synthesis of the E1 proteins may be identified and
isolated. For purposes of this application, nucleic acid
sequence refers to RNA, DNA, cDNA or any synthetic variant
thereof which encodes for peptides.

The invention also relates to the method of
preparing recombinant E1 proteins derived from the E1 cDNA
sequences by cloning the nucleic acid and inserting the
cDNA into an expression vector and expressing the
recombinant protein in a host cell.

The invention also relates to isolated and
substantially purified recombinant E1 proteins and analogs
thereof encoded by the E1 cDNAs.

The invention further relates to the use of
recombinant E1 proteins as diagnostic agents and as
vaccines.

The invention also relates to the use of
single-stranded antisense poly- or oligonucleotides derived from
the E1 cDNAs to inhibit the expression of the hepatitis C
E1 gene.

The invention further relates to multiple
computer-generated alignments of the nucleotide and deduced
amino acid sequences of the 51 E1 cDNAs. These multiple
sequence alignments serve to highlight regions of homology
and non-homology between different sequences and hence, can
be used by one skilled in the art to design peptides and
oligonucleotides useful as reagents in diagnostic assays
and vaccines.

The invention therefore also relates to purified
and isolated peptides and analogs thereof derived from E1
cDNA sequences.

The invention further relates to the use of these
peptides as diagnostic agents and vaccines.

The present invention also encompasses methods of
detecting antibodies specific for hepatitis C virus in
biological samples. The methods of detecting HCV or
antibodies to HCV disclosed in the present invention are
useful for diagnosis of infection and disease caused by HCV
and for monitoring the progression of such disease. Such
methods are also useful for monitoring the efficacy of
therapeutic agents during the course of treatment of HCV
infection and disease in a mammal.

The invention also provides a kit for the
detection of antibodies specific for HCV in a biological
sample where said kit contains at least one purified and
isolated peptide derived from the E1 cDNA sequences.

The invention further provides isolated and
purified genotype-specific oligonucleotides and analogs
thereof derived from E1 cDNA sequences.

The invention also relates to a method for
detecting the presence of hepatitis C virus in a mammal,
said method comprising analyzing the RNA of a mammal for
the presence of hepatitis C virus. The invention further
relates to a method for determining the genotype of
hepatitis C virus present in a mammal. This method is
useful in determining the proper course of treatment for an
HCV-infected patient.

The invention also provides a diagnostic kit for
the detection of hepatitis C virus in a biological sample.
The kit comprises purified and isolated nucleic acid
sequences useful as primers for reverse-transcription
polymerase chain reaction (RT-PCR) analysis of RNA for the
presence of hepatitis C virus.

The invention further provides a diagnostic kit
for the determination of the genotype of a hepatitis C
virus present in a mammal. The kit comprises purified and
isolated nucleic acid sequences useful as primers for
RT-PCR analysis of RNA for the presence of HCV in a biological
sample and purified and isolated nucleic acid sequences
useful as hybridization probes in determining the genotype
of the HCV isolate detected in PCR.

This invention also relates to pharmaceutical
compositions for use in prevention or treatment of
hepatitis C in a mammal.

WO 95/01442 PCT/US94/07320

Description of Figures 
Figures 1A-H show computer-generated sequence
alignments of the nucleotide sequences of the 51 HCV E1
cDNAs. The single letter abbreviations used for the
nucleotides shown in Figures 1A-H are those standardly used
in the art. Figure 1A shows the alignment of SEQ ID NOs:1-8
to produce a consensus sequence for genotype I/1a.
Figure 1B shows the alignment of SEQ ID NOs:9-25 to produce
a consensus sequence for genotype II/1b. Figure 1C shows
the alignment of SEQ ID NOs:26-29 to produce a consensus
sequence for genotype III/2a. Figure 1D shows the
alignment of SEQ ID NOs:30-33 to produce a consensus
sequence for genotype IV/2b. Figure 1E shows the alignment
of SEQ ID NOs:35-39 to produce a consensus sequence for
genotype V/3a. Figure 1F shows the computer alignment of
SEQ ID NOs:42-43 to produce a consensus sequence for
genotype 4c. Figure 1G shows the alignment of SEQ ID
NOs:45-50 to produce a consensus sequence for genotype 5a.
The nucleotides shown in capital letters in the consensus
sequences of Figures 1A-G are those conserved within a
genotype while nucleotides shown in lower case letters in
the consensus sequences are those variable within a
genotype. In addition, in Figures 1A-E and 1G, when a
lower case letter is shown in a consensus sequence, the
lower case letter represents the nucleotide found most
frequently in the sequences aligned to produce the
consensus sequence. In Figure 1F, the lower case letters
shown in the consensus sequence are nucleotides in SEQ ID
NO:42 which differ from nucleotides found in the same
positions in SEQ ID NO:43. Finally, a hyphen at a
nucleotide position in the consensus sequences in Figures
1A-G indicates that two nucleotides were found in equal
numbers at that position in the aligned sequences. In the
aligned sequences, nucleotides are shown in lower case
letters if they differed from the nucleotides of both
adjacent isolates. Figure 1H shows the alignment of the
consensus sequences of Figures 1A-G with SEQ ID NO:34
(genotype 2c), SEQ ID NO:40 (genotype 4a), SEQ ID NO:41
(genotype 4b), SEQ ID NO:44 (genotype 4d) and SEQ ID NO:51
(genotype 6a) to produce a consensus sequence for all
twelve genotypes. This consensus sequence is shown as the
bottom line of Figure 1H where the nucleotides shown in
capital letters are conserved among all genotypes and a
blank space indicates that the nucleotide at that position
is not conserved among all genotypes.
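
The within-genotype consensus notation described for Figures 1A-E and 1G can be illustrated with a minimal Python sketch. The function name `consensus` and the toy input sequences are hypothetical and are not taken from the sequence listing; the sketch also assumes that any tie for the most frequent residue (not only a two-way tie) is written as a hyphen.

```python
from collections import Counter

def consensus(aligned):
    """Build a consensus line from equal-length aligned sequences.

    Capital letter: residue conserved in every sequence.
    Lower case letter: position is variable; letter is the most frequent residue.
    Hyphen: the top residues occur in equal numbers (assumed tie rule).
    """
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("sequences must be aligned to equal length")
    out = []
    for column in zip(*(s.upper() for s in aligned)):
        counts = Counter(column).most_common()
        if len(counts) == 1:
            out.append(counts[0][0])            # conserved position
        elif counts[0][1] == counts[1][1]:
            out.append("-")                     # tie between residues
        else:
            out.append(counts[0][0].lower())    # variable, most frequent
    return "".join(out)

# Toy alignment (hypothetical, not SEQ ID data)
print(consensus(["ACGT", "ACGA", "ATTA", "ATGT"]))  # prints A-g-
```

The same function applies unchanged to aligned amino acid sequences, since it only counts characters per column.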

Figures 2A-H show computer alignments of the
deduced amino acid sequences of the 51 HCV E1 cDNAs. The
single letter abbreviations used for the amino acids shown
in Figures 2A-H follow the conventional amino acid
shorthand for the twenty naturally occurring amino acids.
Figure 2A shows the alignment of SEQ ID NOs:52-59 to
produce a consensus sequence for genotype I/1a. Figure 2B
shows the alignment of SEQ ID NOs:60-76 to produce a
consensus sequence for genotype II/1b. Figure 2C shows the
alignment of SEQ ID NOs:77-80 to produce a consensus
sequence for genotype III/2a. Figure 2D shows the
alignment of SEQ ID NOs:81-84 to produce a consensus
sequence for genotype IV/2b. Figure 2E shows the alignment
of SEQ ID NOs:86-90 to produce a consensus sequence for
genotype V/3a. Figure 2F shows the computer alignment of
SEQ ID NOs:93-94 to produce a consensus sequence for
genotype 4c. Figure 2G shows the alignment of SEQ ID
NOs:96-101 to produce a consensus sequence for genotype 5a.
The amino acids shown in capital letters in the consensus
sequences of Figures 2A-G are those conserved within a
genotype while amino acids shown in lower case letters in
the consensus sequences are those variable within a
genotype. In addition, in Figures 2A-E and 2G, when a
lower case letter is shown in a consensus sequence, the
letter represents the amino acid found most frequently in
the sequences aligned to produce the consensus sequence.
In Figure 2F, the lower case letters shown in the consensus
sequence are amino acids in SEQ ID NO:93 which differ from
amino acids found in the same positions in SEQ ID NO:94.
Finally, a hyphen at an amino acid position in the
consensus sequences of Figures 2A-G indicates that two
amino acids were found in equal numbers at that position in
the aligned sequences. In the aligned sequences, amino
acids are shown in lower case letters if they differed from
the amino acids of both adjacent isolates. Figure 2H shows
the alignment of the consensus sequences of Figures 2A-G
with SEQ ID NO:85 (genotype 2c), SEQ ID NO:91 (genotype
4a), SEQ ID NO:92 (genotype 4b), SEQ ID NO:95 (genotype 4d)
and SEQ ID NO:102 (genotype 6a) to produce a consensus
sequence for all twelve genotypes. This consensus sequence
is shown as the bottom line of Figure 2H where the amino
acids shown in capital letters are conserved among all
genotypes and a blank space indicates that the amino acid
at that position is not conserved among all genotypes.

Figure 3 shows a multiple sequence alignment of the
deduced amino acid sequence of the E1 gene of 51 HCV
isolates collected worldwide. The consensus sequence of
the E1 protein is shown in boldface (top). In the
consensus sequence cysteine residues are highlighted with
stars, potential N-linked glycosylation sites are
underlined, and invariant amino acids are capitalized,
whereas variable amino acids are shown in lower case
letters. In the alignment, amino acids are shown in lower
case letters if they differed from the amino acid of both
adjacent isolates. Amino acid residues shown in bold print
in the alignment represent residues which at that position
in the amino acid sequence are genotype-specific. Amino
acids that were invariant among all HCV isolates are shown
as hyphens (-) in the alignment. Amino acid positions
correspond to those of the HCV prototype sequence (HCV-1,
Choo, Q.-L. et al. (1991) Proc. Natl. Acad. Sci. USA
88:2451-2455) with the first amino acid of the E1 protein at
position 192. The grouping of isolates into 12 genotypes
(I/1a, II/1b, III/2a, IV/2b, V/3a, 2c, 4a, 4b, 4c, 4d, 5a
and 6a) is indicated.

Figure 4 shows a dendrogram of the genetic
relatedness of the twelve genotypes of HCV based on the
percent amino acid identity of the E1 gene of the HCV
genome. The twelve genotypes shown are designated as I/1a,
II/1b, III/2a, IV/2b, V/3a, 2c, 4a, 4b, 4c, 4d, 5a and 6a.
The shaded bars represent a range showing the maximum and
minimum homology between the amino acid sequence of any one
isolate of the genotype indicated and the amino acid
sequence of any other isolate.

Figure 5 shows the distribution of the complete
E1 gene sequence of 74 HCV isolates into the twelve HCV
genotypes in the 12 countries studied. For 51 of these HCV
isolates, including 8 isolates of genotype I/1a, 17
isolates of genotype II/1b and 26 isolates comprising the
additional 10 genotypes, the complete E1 gene sequence was
determined. In the remaining 23 isolates, all of genotypes
I/1a and II/1b, the genotype assignment was based on only a
partial E1 gene sequence. The partially sequenced isolates
did not represent additional genotypes in any of the 12
countries. The number of isolates of a particular genotype
is given in each of the 12 countries studied. For ease of
viewing, those genotypes designated by two terms (e.g.,
I/1a) are indicated by the latter term (e.g., 1a). The
designations used for each country are: Denmark (DK);
Dominican Republic (DR); Germany (D); Hong Kong (HK); India
(IND); Sardinia, Italy (S); Peru (P); South Africa (SA);
Sweden (SW); Taiwan (T); United States (US); and Zaire (Z).
National borders depicted in this figure represent those
existing at the time of sampling.

Detailed Description Of Invention
The present invention relates to 51 cDNAs, each
encoding the complete nucleotide sequence of the envelope 1
(E1) gene of an isolate of human hepatitis C virus (HCV).
The cDNAs of the present invention were obtained as
follows. Viral RNA was extracted from serum collected from
humans infected with hepatitis C virus and the viral RNA
was then reverse transcribed and amplified by polymerase
chain reaction using primers deduced from the sequence of
the HCV strain H-77 (Ogata, N. et al. (1991) Proc. Natl.
Acad. Sci. U.S.A. 88:3392-3396). The amplified cDNA was
then isolated by gel electrophoresis and sequenced.

The present invention further relates to the
nucleotide sequences of the cDNAs encoding the E1 gene of
the 51 HCV isolates. These nucleotide sequences are shown
in the sequence listing as SEQ ID NO:1 through SEQ ID
NO:51.

The abbreviations used for the nucleotides are
those standardly used in the art.

The deduced amino acid sequence of each of SEQ ID
NO:1 through SEQ ID NO:51 is presented in the sequence
listing as SEQ ID NO:52 through SEQ ID NO:102 where the
amino acid sequence in SEQ ID NO:52 is deduced from the
nucleotide sequence shown in SEQ ID NO:1, the amino acid
sequence shown in SEQ ID NO:53 is deduced from the
nucleotide sequence shown in SEQ ID NO:2 and so on. The
deduced amino acid sequence of each of SEQ ID NOs:52-102
starts at nucleotide 1 of the corresponding sequence shown
in SEQ ID NOs:1-51 and extends 595 nucleotides.
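
Deducing an amino acid sequence from a cDNA sequence, reading from nucleotide 1 in frame, can be sketched as follows. This is a minimal illustration that assumes the standard genetic code and prints single letter amino acid codes (the sequence listing itself uses three letter codes). The function name `translate` and the example input are hypothetical and are not taken from SEQ ID NOs:1-51.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...)
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

def translate(cdna):
    """Translate a cDNA string from nucleotide 1, codon by codon.

    A trailing incomplete codon is ignored; codons containing
    ambiguous bases are reported as 'X'; stop codons as '*'.
    """
    seq = cdna.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3
    codons = (seq[i:i + 3] for i in range(0, usable, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)

# Toy input (hypothetical, not from the sequence listing)
print(translate("TACCAGGTGCG"))  # prints YQV
```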

The three letter abbreviations used in SEQ ID
NOs:52-102 follow the conventional amino acid shorthand for
the twenty naturally occurring amino acids.

Preferably, the E1 proteins or peptides of the
present invention are substantially homologous to, and most
preferably biologically equivalent to, the native HCV E1
proteins or peptides. By "biologically equivalent" as used
throughout the specification and claims, it is meant that
the compositions are immunogenically equivalent to the
native E1 proteins or peptides. The E1 proteins or
peptides of the present invention may also stimulate the
production of protective antibodies upon injection into a
mammal that would serve to protect the mammal upon
challenge with HCV. By "substantially homologous" as used
throughout the ensuing specification and claims to describe
E1 proteins and peptides, it is meant a degree of homology
in the amino acid sequence to the native E1 proteins or
peptides. Preferably the degree of homology is in excess
of 90%, preferably in excess of 95%, with a particularly
preferred group of proteins being in excess of 99%
homologous with the native E1 proteins or peptides.
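
The homology thresholds above can be made concrete with a short Python sketch. The specification does not state how the degree of homology is calculated; the sketch assumes it is the percentage of identical positions between two pre-aligned sequences of equal length, with a gap counted as a mismatch. The names `percent_identity` and `is_substantially_homologous` and the example peptides are hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percent of identical positions between two aligned sequences.

    Assumption: both inputs are already aligned to the same length
    and any gap character simply counts as a mismatch.
    """
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)

def is_substantially_homologous(seq_a, seq_b, threshold=90.0):
    """True when identity is strictly in excess of the threshold."""
    return percent_identity(seq_a, seq_b) > threshold

# Toy peptides (hypothetical): 9 of 10 positions identical
print(percent_identity("YQVRNSTGLY", "YEVRNSTGLY"))             # prints 90.0
print(is_substantially_homologous("YQVRNSTGLY", "YEVRNSTGLY"))  # prints False
```

Because the text requires homology "in excess of" each value, the comparison is strict, so a pair at exactly 90% does not qualify.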

Variations are contemplated in the cDNA sequences
shown in SEQ ID NO:1 through SEQ ID NO:51 which will result
in a DNA sequence that is capable of directing production
of analogs of the corresponding envelope 1 (E1) protein
shown in SEQ ID NO:52 through SEQ ID NO:102. It should be
noted that the DNA sequences set forth above represent a
preferred embodiment of the present invention. Due to the
degeneracy of the genetic code, it is to be understood that
numerous choices of nucleotides may be made that will lead
to a DNA sequence capable of directing production of the
instant E1 protein or its analogs. As such, DNA sequences
which are functionally equivalent to the sequence set forth
above or which are functionally equivalent to sequences
that would direct production of analogs of the E1 proteins
produced pursuant to the amino acid sequences set forth
above, are intended to be encompassed within the present
invention.
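
The degeneracy referred to above can be quantified with a brief Python sketch: under the standard genetic code (an assumption of this sketch), the number of distinct DNA sequences encoding a given peptide is the product of the number of synonymous codons for each residue. The names `synonymous_codons` and `count_encodings` and the example peptide are hypothetical.

```python
from itertools import product
from math import prod

# Standard genetic code, codons enumerated in TCAG order
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

def synonymous_codons(residue):
    """All codons encoding one amino acid (single letter code)."""
    return [codon for codon, aa in CODON_TABLE.items() if aa == residue.upper()]

def count_encodings(peptide):
    """Number of distinct DNA sequences that encode the peptide."""
    counts = [len(synonymous_codons(aa)) for aa in peptide]
    if 0 in counts:
        raise ValueError("peptide contains a non-standard residue code")
    return prod(counts)

# Toy peptide (hypothetical): Tyr-Gln-Val-Arg has 2 * 2 * 4 * 6 encodings
print(synonymous_codons("M"))   # prints ['ATG']
print(count_encodings("YQVR"))  # prints 96
```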

The term analog as used throughout the
specification or claims to describe the E1 proteins or
peptides of the present invention, includes any protein or
peptide having an amino acid residue sequence substantially
identical to a sequence specifically shown herein in which
one or more residues have been conservatively substituted
with a biologically equivalent residue. Examples of
conservative substitutions include the substitution of one
non-polar (hydrophobic) residue such as isoleucine, valine,
leucine or methionine for another, the substitution of one
polar (hydrophilic) residue for another such as between
arginine and lysine, between glutamine and asparagine,
between glycine and serine, the substitution of one basic
residue such as lysine, arginine or histidine for another,
or the substitution of one acidic residue, such as aspartic
acid or glutamic acid for another.
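
The examples of conservative substitution listed above can be encoded as a simple lookup, sketched below in Python. The groups contain only the residues named in the preceding paragraph, so the table is illustrative rather than exhaustive; the names `CONSERVATIVE_GROUPS` and `is_conservative` are hypothetical.

```python
# Residue groups taken from the examples in the text (single letter codes)
CONSERVATIVE_GROUPS = [
    set("IVLM"),  # non-polar (hydrophobic): Ile, Val, Leu, Met
    set("RK"),    # polar pair: Arg, Lys
    set("QN"),    # polar pair: Gln, Asn
    set("GS"),    # polar pair: Gly, Ser
    set("KRH"),   # basic: Lys, Arg, His
    set("DE"),    # acidic: Asp, Glu
]

def is_conservative(original, substitute):
    """True if replacing `original` with `substitute` falls within
    one of the example groups; identical residues are not substitutions."""
    a, b = original.upper(), substitute.upper()
    if a == b:
        return False
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)

print(is_conservative("I", "V"))  # prints True
print(is_conservative("D", "K"))  # prints False
```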

The phrase "conservative substitution" also
includes the use of a chemically derivatized residue in
place of a non-derivatized residue provided that the
resulting protein or peptide is biologically equivalent to
the native E1 protein or peptide.

"Chemical derivative" refers to an E1 protein or
peptide having one or more residues chemically derivatized
by reaction of a functional side group. Examples of such
derivatized molecules include, but are not limited to,
those molecules in which free amino groups have been
derivatized to form amine hydrochlorides, p-toluene
sulfonyl groups, carbobenzoxy groups, t-butyloxycarbonyl
groups, chloroacetyl groups or formyl groups. Free carboxyl
groups may be derivatized to form salts, methyl and ethyl
esters or other types of esters or hydrazides. Free
hydroxyl groups may be derivatized to form O-acyl or
O-alkyl derivatives. The imidazole nitrogen of histidine may
be derivatized to form N-im-benzylhistidine. Also included
as chemical derivatives are those proteins or peptides
which contain one or more naturally-occurring amino acid
derivatives of the twenty standard amino acids. For
example: 4-hydroxyproline may be substituted for proline;
5-hydroxylysine may be substituted for lysine;
3-methylhistidine may be substituted for histidine;
homoserine may be substituted for serine; and ornithine may
be substituted for lysine. The E1 protein or peptide of
the present invention also includes any protein or peptide
having one or more additions and/or deletions of residues
relative to the sequence of a peptide whose sequence is
shown herein, so long as the peptide is biologically
equivalent to the native E1 protein or peptide.

The present invention also includes a recombinant
DNA method for the manufacture of HCV E1 proteins. In this
method, natural or synthetic nucleic acid sequences may be
used to direct the production of E1 proteins.

In one embodiment of the invention, the method
comprises:

(a) preparation of a nucleic acid sequence
capable of directing a host organism to produce HCV E1
protein;

(b) cloning the nucleic acid sequence into a
vector capable of being transferred into and replicated in
a host organism, such vector containing operational
elements for the nucleic acid sequence;

(c) transferring the vector containing the
nucleic acid and operational elements into a host organism
capable of expressing the protein;

(d) culturing the host organism under conditions
appropriate for amplification of the vector and expression
of the protein; and

(e) harvesting the protein.

In another embodiment of the invention, the
method for the recombinant DNA synthesis of an HCV E1
protein encoded by any one of the nucleic acid sequences
shown in SEQ ID NOs:1-51 comprises:

(a) culturing a transformed or transfected host
organism containing a nucleic acid sequence capable of
directing the host organism to produce a protein, under
conditions such that the protein is produced, said protein
exhibiting substantial homology to a native E1 protein
isolated from HCV having the amino acid sequence according
to any one of the amino acid sequences shown in SEQ ID
NOs:52-102 or combinations thereof.

In one embodiment, the RNA sequence of an HCV
isolate is isolated and cloned to cDNA as follows. Viral
RNA is extracted from a biological sample collected from
human subjects infected with hepatitis C and the viral RNA
is then reverse transcribed and amplified by polymerase
chain reaction using primers deduced from the sequence of
HCV strain H-77 (Ogata et al. (1991)). Preferred primer
sequences are shown as SEQ ID NOs:103-108 in the sequence
listing. Once amplified, the PCR fragments are isolated by
gel electrophoresis and sequenced.

The vectors contemplated for use in the present
invention include any vectors into which a nucleic acid
sequence as described above can be inserted, along with any
preferred or required operational elements, and which
vector can then be subsequently transferred into a host
organism and replicated in such organisms. Preferred
vectors are those whose restriction sites have been well
documented and which contain the operational elements
preferred or required for transcription of the nucleic acid
sequence.

The "operational elements" as discussed herein
include at least one promoter, at least one operator, at
least one leader sequence, at least one terminator codon,
and any other DNA sequences necessary or preferred for
appropriate transcription and subsequent translation of the
vector nucleic acid. In particular, it is contemplated
that such vectors will contain at least one origin of
replication recognized by the host organism along with at
least one selectable marker and at least one promoter
sequence capable of initiating transcription of the nucleic
acid sequence.

In construction of the recombinant for expression
cloning vector of the present invention, it should
additionally be noted that multiple copies of the nucleic
acid sequence and its attendant operational elements may be
inserted into each vector. In such an embodiment, the host
organism would produce greater amounts per vector of the
desired E1 protein. The number of multiple copies of the
DNA sequence which may be inserted into the vector is
limited only by the ability of the resultant vector, due to
its size, to be transferred into and replicated and
transcribed in an appropriate host microorganism.

In another embodiment, restriction digest
fragments containing a coding sequence for E1 proteins can
be inserted into a suitable expression vector that
functions in prokaryotic or eukaryotic cells. By suitable
is meant that the vector is capable of carrying and
expressing a complete nucleic acid sequence coding for E1
protein. Preferred expression vectors are those that
function in a eukaryotic cell. Examples of such vectors
include but are not limited to vaccinia virus vectors,
adenovirus or herpes viruses. A preferred vector is the
baculovirus transfer vector, pBlueBac.

In yet another embodiment, the selected
recombinant expression vector may then be transfected into
a suitable eukaryotic cell system for purposes of
expressing the recombinant protein. Such eukaryotic cell
systems include but are not limited to cell lines such as
HeLa, MRC-5 or CV-1. A preferred eukaryotic cell system is
Sf9 insect cells.

The expressed recombinant protein may be detected 
by methods known in the art including, but not limited to, 
Coomassie blue staining and Western blotting. 

The present invention also relates to
substantially purified and isolated recombinant E1
proteins. In one embodiment, the recombinant protein
expressed by the Sf9 cells can be obtained as a crude
lysate or it can be purified by standard protein
purification procedures known in the art which may include
differential precipitation, molecular sieve chromatography,
ion-exchange chromatography, isoelectric focusing, gel
electrophoresis and affinity and immunoaffinity
chromatography. The recombinant protein may be purified by
passage through a column containing a resin which has bound
thereto antibodies specific for the open reading frame
(ORF) protein.

The present invention further relates to the use
of recombinant E1 proteins as diagnostic agents and
vaccines. In one embodiment, the expressed recombinant
proteins of this invention can be used in immunoassays for
diagnosing or prognosing hepatitis C in a mammal. For the
purposes of the present invention, "mammal" as used
throughout the specification and claims, includes, but is
not limited to, humans, chimpanzees, other primates and the
like. In a preferred embodiment, the immunoassay is useful
in diagnosing hepatitis C infection in humans.

Immunoassays of the present invention may be a
radioimmunoassay, Western blot assay, immunofluorescent
assay, enzyme immunoassay, chemiluminescent assay,
immunohistochemical assay and the like. Standard
techniques known in the art for ELISA are described in
Methods in Immunodiagnosis, 2nd Edition, Rose and Bigazzi,
eds., John Wiley and Sons, 1980 and Campbell et al.,
Methods of Immunology, W.A. Benjamin, Inc., 1964, both of
which are incorporated herein by reference. Such assays
may be a direct, indirect, competitive, or noncompetitive
immunoassay as described in the art (Oellerich, M. 1984. J.
Clin. Chem. Clin. Biochem. 22:895-904). Biological samples
appropriate for such detection assays include, but are not
limited to, serum, liver, saliva, lymphocytes or other
mononuclear cells.

In a preferred embodiment, test serum is reacted
with a solid phase reagent having surface-bound recombinant
HCV E1 protein as an antigen. The solid surface reagent
can be prepared by known techniques for attaching protein
to solid support material. These attachment methods
include non-specific adsorption of the protein to the
support or covalent attachment of the protein to a reactive
group on the support. After reaction of the antigen with
anti-HCV antibody, unbound serum components are removed by
washing and the antigen-antibody complex is reacted with a
secondary antibody such as labelled anti-human antibody.
The label may be an enzyme which is detected by incubating
the solid support in the presence of a suitable
fluorimetric or colorimetric reagent. Other detectable
labels may also be used, such as radiolabels or colloidal
gold, and the like.

The HCV E1 protein and analogs thereof may be
prepared in the form of a kit, alone, or in combinations
with other reagents such as secondary antibodies, for use
in immunoassays.

In yet another embodiment, the recombinant E1
proteins or analogs thereof can be used as a vaccine to
protect mammals against challenge with hepatitis C. The
vaccine, which acts as an immunogen, may be a cell, cell
lysate from cells transfected with a recombinant expression
vector or a culture supernatant containing the expressed
protein. Alternatively, the immunogen is a partially or
substantially purified recombinant protein.

While it is possible for the immunogen to be 
administered in a pure or substantially pure form, it is 
preferable to present it as a pharmaceutical composition, 
formulation or preparation. 

The formulations of the present invention, both
for veterinary and for human use, comprise an immunogen as
described above, together with one or more pharmaceutically
acceptable carriers and optionally other therapeutic
ingredients. The carrier(s) must be "acceptable" in the
sense of being compatible with the other ingredients of the
formulation and not deleterious to the recipient thereof.
The formulations may conveniently be presented in unit
dosage form and may be prepared by any method well-known in



WO 95/01442 



PCT/US94/07320 



the pharmaceutical art. 

All methods include the step of bringing into
association the active ingredient with the carrier which
constitutes one or more accessory ingredients. In general,
the formulations are prepared by uniformly and intimately
bringing into association the active ingredient with liquid
carriers or finely divided solid carriers or both, and
then, if necessary, shaping the product into the desired
formulation.

Formulations suitable for intravenous,
intramuscular, subcutaneous, or intraperitoneal
administration conveniently comprise sterile aqueous
solutions of the active ingredient with solutions which are
preferably isotonic with the blood of the recipient. Such
formulations may be conveniently prepared by dissolving the
solid active ingredient in water containing physiologically
compatible substances such as sodium chloride (e.g.,
0.1-2.0 M), glycine, and the like, and having a buffered pH
compatible with physiological conditions to produce an
aqueous solution, and rendering said solution sterile.
These may be presented in unit or multi-dose containers, for
example, sealed ampoules or vials.

The formulations of the present invention may 
incorporate a stabilizer. Illustrative stabilizers are 
preferably incorporated in an amount of 0.11-10,000 parts 
by weight per part by weight of immunogens. If two or more 
stabilizers are to be used, their total amount is 
preferably within the range specified above. These 
stabilizers are used in aqueous solutions at the 
appropriate concentration and pH. The specific osmotic
pressure of such aqueous solutions is generally in the
range of 0.1-3.0 osmoles, preferably in the range of
0.8-1.2. The pH of the aqueous solution is adjusted to be
within the range of 5.0-9.0, preferably within the range of
6-8. In formulating the immunogen of the present
invention, an anti-adsorption agent may be used.

Additional pharmaceutical methods may be employed
to control the duration of action. Controlled release
preparations may be achieved through the use of polymer to
complex or adsorb the proteins or their derivatives. The
controlled delivery may be exercised by selecting
appropriate macromolecules (for example polyester,
polyamino acids, polyvinyl pyrrolidone,
ethylenevinylacetate, methylcellulose,
carboxymethylcellulose, or protamine sulfate) and the
concentration of macromolecules as well as the methods of
incorporation in order to control release. Another
possible method to control the duration of action by
controlled-release preparations is to incorporate the
proteins, protein analogs or their functional derivatives,
into particles of a polymeric material such as polyesters,
polyamino acids, hydrogels, poly(lactic acid) or ethylene
vinylacetate copolymers. Alternatively, instead of
incorporating these agents into polymeric particles, it is
possible to entrap these materials in microcapsules
prepared, for example, by coacervation techniques or by
interfacial polymerization, for example,
hydroxymethylcellulose or gelatin microcapsules and
poly(methylmethacrylate) microcapsules, respectively, or in
colloidal drug delivery systems, for example, liposomes,
albumin microspheres, microemulsions, nanoparticles, and
nanocapsules or in macroemulsions.

When oral preparations are desired, the
compositions may be combined with typical carriers, such as
lactose, sucrose, starch, talc, magnesium stearate,
crystalline cellulose, methyl cellulose, carboxymethyl
cellulose, glycerin, sodium alginate or gum arabic among
others.

The proteins of the present invention may be
supplied in the form of a kit, alone, or in the form of a
pharmaceutical composition as described above.

Vaccination can be conducted by conventional
methods. For example, the immunogen or immunogens (i.e.,
the E1 protein may be administered alone or in combination
with the E1 proteins derived from other isolates of HCV)
can be used in a suitable diluent such as saline or water,
or complete or incomplete adjuvants. Further, the
immunogen(s) may or may not be bound to a carrier to make
the protein(s) immunogenic. Examples of such carrier
molecules include but are not limited to bovine serum
albumin (BSA), keyhole limpet hemocyanin (KLH), tetanus
toxoid, and the like. The immunogen(s) can be administered
by any route appropriate for antibody production such as
intravenous, intraperitoneal, intramuscular, subcutaneous,
and the like. The immunogen(s) may be administered once or
at periodic intervals until a significant titer of anti-HCV
antibody is produced. The antibody may be detected in the
serum using an immunoassay.

The administration of the immunogen(s) of the
present invention may be for either a prophylactic or
therapeutic purpose. When provided prophylactically, the
immunogen(s) is provided in advance of any exposure to HCV
or in advance of any symptom due to HCV
infection. The prophylactic administration of the
immunogen serves to prevent or attenuate any subsequent
infection of HCV in a mammal. When provided
therapeutically, the immunogen(s) is provided at (or
shortly after) the onset of the infection or at the onset
of any symptom of infection or disease caused by HCV. The
therapeutic administration of the immunogen(s) serves to
attenuate the infection or disease.

In addition to use as a vaccine, the compositions
can be used to prepare antibodies to HCV E1 proteins. The
antibodies can be used directly as antiviral agents. To
prepare antibodies, a host animal is immunized using the E1
proteins native to the virus particle bound to a carrier as
described above for vaccines. The host serum or plasma is
collected following an appropriate time interval to provide
a composition comprising antibodies reactive with the E1
protein of the virus particle. The gamma globulin fraction
or the IgG antibodies can be obtained, for example, by use
of saturated ammonium sulfate or DEAE Sephadex, or other
techniques known to those skilled in the art. The
antibodies are substantially free of many of the adverse
side effects which may be associated with other anti-viral
agents such as drugs.

The antibody compositions can be made even more
compatible with the host system by minimizing potential
adverse immune system responses. This is accomplished by
removing all or a portion of the Fc portion of a foreign
species antibody or using an antibody of the same species
as the host animal, for example, the use of antibodies from
human/human hybridomas. Humanized antibodies (i.e.,
nonimmunogenic in a human) may be produced, for example, by
replacing an immunogenic portion of an antibody with a
corresponding, but nonimmunogenic portion (i.e., chimeric
antibodies). Such chimeric antibodies may contain the
reactive or antigen binding portion of an antibody from one
species and the Fc portion of an antibody (nonimmunogenic)
from a different species. Examples of chimeric antibodies
include, but are not limited to, non-human mammal-human
chimeras, rodent-human chimeras, murine-human and rat-human
chimeras (Robinson et al., International Patent Application
184,187; Taniguchi M., European Patent Application 171,496;
Morrison et al., European Patent Application 173,494;
Neuberger et al., PCT Application WO 86/01533; Cabilly et
al., 1987 Proc. Natl. Acad. Sci. USA 84:3439; Nishimura et
al., 1987 Canc. Res. 47:999; Wood et al., 1985 Nature
314:446; Shaw et al., 1988 J. Natl. Cancer Inst. 80:1553,
all incorporated herein by reference).

General reviews of "humanized" chimeric
antibodies are provided by Morrison S., 1985 Science
229:1202 and by Oi et al., 1986 BioTechniques 4:214.

Suitable "humanized" antibodies can be
alternatively produced by CDR or CEA substitution (Jones et
al., 1986 Nature 321:552; Verhoeyen et al., 1988 Science
239:1534; Beidler et al., 1988 J. Immunol. 141:4053, all
incorporated herein by reference).

The antibodies or antigen binding fragments may
also be produced by genetic engineering. The technology
for expression of both heavy and light chain genes in E.
coli is the subject of the PCT patent applications,
publication numbers WO 901443, WO901443, and WO 9014424, and
of Huse et al., 1989 Science 246:1275-1281.

The antibodies can also be used as a means of
enhancing the immune response. The antibodies can be
administered in amounts similar to those used for other
therapeutic administrations of antibody. For example,
normal immune globulin is administered at 0.02-0.1 ml/lb
body weight during the early incubation period of other
viral diseases such as rabies, measles, and hepatitis B to
interfere with viral entry into cells. Thus, antibodies
reactive with the HCV E1 protein can be passively
administered alone or in conjunction with another
anti-viral agent to a host infected with an HCV to enhance the
immune response and/or the effectiveness of an antiviral
drug.

Alternatively, anti-HCV E1 antibodies can be
induced by administering anti-idiotype antibodies as
immunogens. Conveniently, a purified anti-HCV E1 antibody
preparation prepared as described above is used to induce
anti-idiotype antibody in a host animal. The composition is
administered to the host animal in a suitable diluent.
Following administration, usually repeated administration,
the host produces anti-idiotype antibody. To eliminate an
immunogenic response to the Fc region, antibodies produced
by the same species as the host animal can be used or the
Fc region of the administered antibodies can be removed.
Following induction of anti-idiotype antibody in the host
animal, serum or plasma is removed to provide an antibody
composition. The composition can be purified as described
above for anti-HCV E1 antibodies, or by affinity
chromatography using anti-HCV E1 antibodies bound to the
affinity matrix. The anti-idiotype antibodies produced are
similar in conformation to the authentic HCV E1 protein and
may be used to prepare an HCV vaccine rather than using an
HCV E1 protein.

When used as a means of inducing anti-HCV virus 
antibodies in an animal, the manner of injecting the 
antibody is the same as for vaccination purposes, namely 
intramuscularly, intraperitoneally, subcutaneously or the 
like in an effective concentration in a physiologically 
suitable diluent with or without adjuvant. One or more 
booster injections may be desirable. 

The HCV E1 proteins of the invention are also
intended for use in producing antiserum designed for pre-
or post-exposure prophylaxis. Here an E1 protein, or
mixture of E1 proteins, is formulated with a suitable
adjuvant and administered by injection to human volunteers,
according to known methods for producing human antisera.
Antibody response to the injected proteins is monitored,
during a several-week period following immunization, by
periodic serum sampling to detect the presence of anti-HCV
E1 serum antibodies, using an immunoassay as described
herein.

The antiserum from immunized individuals may be
administered as a pre-exposure prophylactic measure for
individuals who are at risk of contracting infection. The
antiserum is also useful in treating an individual
post-exposure, analogous to the use of high titer antiserum
against hepatitis B virus for post-exposure prophylaxis.

For both in vivo use of antibodies to HCV
virus-like particles and proteins and anti-idiotype antibodies
and diagnostic use, it may be preferable to use monoclonal
antibodies. Monoclonal anti-HCV E1 protein antibodies or
anti-idiotype antibodies can be produced as follows. The
spleen or lymphocytes from an immunized animal are removed
and immortalized or used to prepare hybridomas by methods
known to those skilled in the art (Goding, J.W. 1983.
Monoclonal Antibodies: Principles and Practice, Academic
Press, Inc., NY, NY, pp. 56-97). To produce a human-human
hybridoma, a human lymphocyte donor is selected. A donor
known to be infected with HCV (where infection has been
shown for example by the presence of anti-virus antibodies
in the blood or by virus culture) may serve as a suitable
lymphocyte donor. Lymphocytes can be isolated from a
peripheral blood sample or spleen cells may be used if the
donor is subject to splenectomy. Epstein-Barr virus (EBV)
can be used to immortalize human lymphocytes or a human
fusion partner can be used to produce human-human
hybridomas. Primary in vitro immunization with peptides
can also be used in the generation of human monoclonal
antibodies.

Antibodies secreted by the immortalized cells are
screened to determine the clones that secrete antibodies of
the desired specificity. For monoclonal anti-E1
antibodies, the antibodies must bind to HCV E1 protein.
For monoclonal anti-idiotype antibodies, the antibodies
must bind to anti-E1 protein antibodies. Cells producing
antibodies of the desired specificity are selected.

The present invention also relates to the use of
single-stranded antisense poly- or oligonucleotides derived
from nucleotide sequences substantially homologous to those
shown in SEQ ID NOs:1-51 to inhibit the expression of
hepatitis C E1 genes. By substantially homologous, as used
throughout the specification and claims to describe the
nucleic acid sequences of the present invention, is meant a
level of homology between the nucleic acid sequence and the
SEQ ID NOs. referred to in that sentence. Preferably, the
level of homology is in excess of 80%, more preferably in
excess of 90%, with a preferred nucleic acid sequence being
in excess of 95% homologous with the DNA sequence shown in
the indicated SEQ ID NO. These antisense poly- or
oligonucleotides can be either DNA or RNA. The targeted
sequence is typically messenger RNA and, more preferably, a
single sequence required for processing or translation of
the RNA. The antisense poly- or oligonucleotides can be
conjugated to a polycation such as polylysine as disclosed
in Lemaitre, M. et al. ((1989) Proc. Natl. Acad. Sci. USA
84:648-652) and this conjugate can be administered to a
mammal in an amount sufficient to hybridize to and inhibit
the function of the messenger RNA.

The present invention further relates to multiple
computer-generated alignments of the nucleotide and deduced
amino acid sequences shown in SEQ ID NOs:1-102. Computer
analysis of the nucleotide sequences shown in SEQ ID
NOs:1-51 and of the deduced amino acid sequences shown in SEQ ID
NOs:52-102 can be carried out using commercially available
computer programs known to one skilled in the art.

In one embodiment, computer analysis of SEQ ID
NOs:1-51 by the program GENALIGN (Intelligenetics, Inc.,
Mountain View, CA) results in distribution of the 51
sequences into twelve genotypes based upon the degree of
variation of the sequences. For the purposes of the
present invention, the nucleotide sequence identity of E1
cDNAs of HCV isolates of the same genotype is in the range
of about 85% to about 100% whereas the identity of E1 cDNA
sequences of different genotypes is in the range of about
50% to about 80%.
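
The identity ranges stated above can be illustrated with a short sketch. It is not the GENALIGN procedure, which is not described here; it assumes two sequences that have already been aligned to equal length, and the cut-off constants and function names are illustrative choices taken from the "about 85%" and "about 80%" figures in the text.

```python
# Illustrative only: thresholds taken from the "about 85%" and "about 80%"
# figures in the text; function names are hypothetical.
SAME_GENOTYPE_MIN = 85.0
DIFFERENT_GENOTYPE_MAX = 80.0


def percent_identity(seq_a, seq_b):
    """Percent identity between two pre-aligned sequences of equal length."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Ignore alignment columns in which either sequence carries a gap.
    pairs = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
             if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable positions")
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


def classify_identity(identity):
    """Map a percent identity onto the ranges quoted in the text."""
    if identity >= SAME_GENOTYPE_MIN:
        return "same genotype"
    if identity <= DIFFERENT_GENOTYPE_MAX:
        return "different genotype"
    return "indeterminate"  # falls between the two stated ranges
```

Values falling between the two stated ranges are reported as indeterminate rather than forced into either class.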

The grouping of SEQ ID NOs:1-51 into twelve HCV
genotypes is shown below.

SEQ ID NOs:    Genotype

1-8            I/1a
9-25           II/1b
26-29          III/2a
30-33          IV/2b
34             2c
35-39          V/3a
40             4a
41             4b
42-43          4c
44             4d
45-50          5a
51             6a
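
For readers who want to use this grouping programmatically, the table can be written as a small lookup. This is only a convenience sketch; the names `GENOTYPE_RANGES` and `genotype_of` are hypothetical and the data are copied from the table above.

```python
# (first SEQ ID NO, last SEQ ID NO, genotype), copied from the table above.
GENOTYPE_RANGES = [
    (1, 8, "I/1a"), (9, 25, "II/1b"), (26, 29, "III/2a"), (30, 33, "IV/2b"),
    (34, 34, "2c"), (35, 39, "V/3a"), (40, 40, "4a"), (41, 41, "4b"),
    (42, 43, "4c"), (44, 44, "4d"), (45, 50, "5a"), (51, 51, "6a"),
]


def genotype_of(seq_id_no):
    """Return the genotype assigned to an E1 nucleotide SEQ ID NO (1-51)."""
    for first, last, genotype in GENOTYPE_RANGES:
        if first <= seq_id_no <= last:
            return genotype
    raise ValueError("SEQ ID NO %r is not one of the 51 E1 sequences" % seq_id_no)
```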

For those genotypes containing more than one E1
nucleotide sequence, computer alignment of the constituent
nucleotide sequences of the genotype was conducted using
GENALIGN in order to produce a consensus sequence for each
genotype. These alignments and their resultant consensus
sequences are shown in Figures 1A-G for the seven genotypes
(I/1a, II/1b, III/2a, IV/2b, V/3a, 4c and 5a) which
comprise more than one nucleotide sequence. Further
alignment of the consensus sequences of Figures 1A-G with
SEQ ID NO:34 (genotype 2c), SEQ ID NO:40 (genotype 4a), SEQ
ID NO:41 (genotype 4b), SEQ ID NO:44 (genotype 4d) and SEQ
ID NO:51 (genotype 6a) produces a consensus sequence for
all twelve genotypes as shown in Figure 1H. The multiple
alignments of nucleotide sequences shown in Figures 1A-H
serve to highlight regions of homology and non-homology
between different sequences and hence can be used by one
skilled in the art to design oligonucleotides useful as
reagents in diagnostic assays for HCV.
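
The idea of deriving a consensus from aligned sequences can be sketched as follows. The rule GENALIGN applies is not stated in the text, so this example assumes a simple majority vote per alignment column and marks ties with an ambiguity symbol; the function name and the choice of "N" are illustrative.

```python
from collections import Counter


def consensus(aligned_sequences, ambiguous="N"):
    """Majority-rule consensus of equal-length, pre-aligned sequences.

    Assumption: a column with a tie for the most common base is reported
    as `ambiguous`; this is not necessarily what GENALIGN does.
    """
    if not aligned_sequences:
        raise ValueError("at least one sequence is required")
    length = len(aligned_sequences[0])
    if any(len(seq) != length for seq in aligned_sequences):
        raise ValueError("sequences must be aligned to the same length")
    result = []
    for column in zip(*aligned_sequences):
        counts = Counter(base.upper() for base in column).most_common()
        top_base, top_count = counts[0]
        tied = len(counts) > 1 and counts[1][1] == top_count
        result.append(ambiguous if tied else top_base)
    return "".join(result)
```

Columns that come out unanimous across genotypes mark candidate universal regions, while columns that differ between genotype consensus sequences mark candidate genotype-specific regions.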

Examples of purified and isolated oligonucleotide
sequences provided by the present invention are shown as
SEQ ID NOs:109-135. The oligonucleotides shown in SEQ ID
NOs:109-135 are useful as "genotype-specific" primers and
probes since these oligonucleotides can hybridize
specifically to the nucleotide sequence of the E1 gene of
HCV isolates belonging to a single genotype. The
genotype-specificity of the oligonucleotides shown in SEQ ID
NOs:109-135 is as follows: SEQ ID NOs:109-110 are specific
for genotype I/1a; SEQ ID NOs:111-112 are specific for
genotype II/1b; SEQ ID NOs:113-114 are specific for
genotype III/2a; SEQ ID NOs:115-116 are specific for
genotype IV/2b; SEQ ID NOs:117-119 are specific for
genotype 2c; SEQ ID NOs:120-122 are specific for genotype
V/3a; SEQ ID NOs:123-124 are specific for genotype 4a; SEQ
ID NOs:125-126 are specific for genotype 4b; SEQ ID
NOs:127-128 are specific for genotype 4c; SEQ ID
NOs:129-130 are specific for genotype 4d; SEQ ID NOs:131-132 are
specific for genotype 5a and SEQ ID NOs:133-135 are
specific for genotype 6a.

The oligonucleotides of this invention can be
synthesized using any of the known methods of
oligonucleotide synthesis (e.g., the phosphodiester method
of Agarwal et al. 1972, Angew. Chem. Int. Ed. Engl. 11:451,
the phosphotriester method of Hsiung et al. 1979, Nucleic
Acids Res. 6:1371, or the automated diethylphosphoramidite
method of Beaucage et al. 1981, Tetrahedron Letters
22:1859-1862), or they can be isolated fragments of
naturally occurring or cloned DNA. In addition, those
skilled in the art would be aware that oligonucleotides can
be synthesized by automated instruments sold by a variety
of manufacturers or can be commercially custom ordered and
prepared. In a preferred embodiment, SEQ ID NO:103 through
SEQ ID NO:135 are synthetic oligonucleotides.

The present invention also relates to a method 
for detecting the presence of HCV in a mammal, said method 
comprising analyzing the RNA of a mammal for the presence 
of hepatitis C virus. 

The RNA to be analyzed can be isolated from
serum, liver, saliva, lymphocytes or other mononuclear
cells as viral RNA, whole cell RNA or as poly(A)+ RNA.
Whole cell RNA can be isolated by methods known to those
skilled in the art. Such methods include extraction of RNA
by differential precipitation (Birnboim, H.C. (1988)
Nucleic Acids Res., 16:1487-1497), extraction of RNA by
organic solvents (Chomczynski, P. et al. (1987) Anal.
Biochem., 162:156-159) and extraction of RNA with strong
denaturants (Chirgwin, J.M. et al. (1979) Biochemistry,
18:5294-5299). Poly(A)+ RNA can be selected from whole cell
RNA by affinity chromatography on oligo-d(T) columns (Aviv,
H. et al. (1972) Proc. Natl. Acad. Sci., 69:1408-1412). A
preferred method of isolating RNA is extraction of viral
RNA by the guanidinium-phenol-chloroform method of Bukh et
al. (1992a).

The methods for analyzing the RNA for the
presence of HCV include Northern blotting (Alwine, J.C. et
al. (1977) Proc. Natl. Acad. Sci., 74:5350-5354), dot and
slot hybridization (Kafatos, F.C. et al. (1979) Nucleic
Acids Res., 7:1541-1552), filter hybridization (Hollander,
M.C. et al. (1990) Biotechniques, 9:174-179), RNase
protection (Sambrook, J. et al. (1989) in "Molecular
Cloning, A Laboratory Manual", Cold Spring Harbor Press,
Plainview, NY) and reverse-transcription polymerase chain
reaction (RT-PCR) (Watson, J.D. et al. (1992) in
"Recombinant DNA" Second Edition, W.H. Freeman and Company,
New York). A preferred method is RT-PCR. In this method,
the RNA can be reverse transcribed to first strand cDNA
using a primer or primers derived from the nucleotide
sequences shown in SEQ ID NOs:1-51. A preferred primer for
reverse transcription is that shown in SEQ ID NO:104. Once
the cDNAs are synthesized, PCR amplification is carried out
using pairs of primers designed to hybridize with sequences
in the HCV E1 cDNA which are an appropriate distance apart
(at least about 50 nucleotides) to permit amplification of
the cDNA and subsequent detection of the amplification
product. Each primer of a pair is a single-stranded
oligonucleotide of about 20 to about 60 bases in length
where one primer (the "upstream" primer) is complementary
to the original RNA and the second primer (the "downstream"
primer) is complementary to the first strand of cDNA
generated by reverse transcription of the RNA. The target
sequence is generally about 100 to about 300 base pairs
long but can be as large as 500-1500 base pairs.
Optimization of the amplification reaction to obtain
sufficiently specific hybridization to the E1 nucleotide
sequence is well within the skill in the art and is
preferably achieved by adjusting the annealing temperature.
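
The numeric constraints in the preceding paragraph (primers of about 20 to about 60 bases, binding sites at least about 50 nucleotides apart, targets up to about 1500 base pairs) can be collected into a simple check. This is a sketch under stated assumptions: coordinates are 1-based positions on the sense-strand cDNA, "distance apart" is interpreted as the gap between the two primer binding sites, and all names are hypothetical.

```python
MIN_PRIMER_LEN, MAX_PRIMER_LEN = 20, 60   # "about 20 to about 60 bases"
MIN_GAP = 50                              # "at least about 50 nucleotides"
MAX_AMPLICON = 1500                       # "as large as 500-1500 base pairs"


def check_primer_pair(sense_start, sense_len, antisense_end, antisense_len):
    """Return a list of problems; an empty list means the pair fits the ranges.

    sense_start   -- first base covered by the sense-strand primer
    antisense_end -- last base covered by the antisense-strand primer
    """
    problems = []
    for name, length in (("sense", sense_len), ("antisense", antisense_len)):
        if not MIN_PRIMER_LEN <= length <= MAX_PRIMER_LEN:
            problems.append("%s primer length %d outside 20-60 bases" % (name, length))
    amplicon = antisense_end - sense_start + 1
    # Assumed interpretation: distance between the two primer binding sites.
    gap = amplicon - sense_len - antisense_len
    if gap < MIN_GAP:
        problems.append("primer sites only %d nt apart" % gap)
    if amplicon > MAX_AMPLICON:
        problems.append("amplicon of %d bp exceeds 1500 bp" % amplicon)
    return problems
```

For example, 20-base primers starting at position 197 and ending at position 480 give a 284 bp product, within the "about 100 to about 300" range cited as typical.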

In one embodiment, the primer pairs selected to
amplify E1 cDNAs are universal primers. By "universal", as
used to describe primers throughout the claims and
specification, is meant those primer pairs which can
amplify E1 gene fragments derived from an HCV isolate
belonging to any one of the twelve genotypes of HCV
described herein. Purified and isolated universal primers
are used in Example 1 of the present invention and are
shown as SEQ ID NOs:103-108 where SEQ ID NOs:103 and 104
represent one pair of primers, SEQ ID NOs:105 and 106
represent a second pair of primers and SEQ ID NOs:107-108
represent a third pair of primers.

In an alternative embodiment, primer pairs
selected to amplify E1 cDNAs are genotype-specific primers.
In the present invention, genotype-specific primer pairs
can readily be derived from the following genotype-specific
nucleotide domains: nucleotides 197-238 and 450-480 of the
consensus sequence of genotype I/1a shown in Figure 1A;
nucleotides 197-238 and 450-480 of the consensus sequence
of genotype II/1b shown in Figure 1B; nucleotides 199-238
and 438-480 of the consensus sequence of genotype III/2a
shown in Figure 1C; nucleotides 124-177 and 450-480 of the
consensus sequence of genotype IV/2b shown in Figure 1D;
nucleotides 124-177, 193-238 and 436-480 of SEQ ID NO:34
(genotype 2c); nucleotides 168-207, 294-339 and 406-480 of
the consensus sequence of genotype V/3a shown in Figure 1E;
nucleotides 145-183 and 439-480 of SEQ ID NO:40 (genotype
4a); nucleotides 168-207 and 432-480 of SEQ ID NO:41
(genotype 4b); nucleotides 130-183 and 450-480 of the
consensus sequence of genotype 4c shown in Figure 1F;
nucleotides 130-183 and 450-480 of SEQ ID NO:44 (genotype
4d); nucleotides 166-208 and 437-480 of the consensus
sequence of genotype 5a shown in Figure 1G; and nucleotides
168-207, 216-252 and 429-480 of SEQ ID NO:51 (genotype 6a).
One skilled in the art would readily appreciate that in a
pair of genotype-specific primers, each primer is derived
from a different genotype-specific nucleotide domain
indicated above for a given genotype. Also, as described
earlier, it is understood by one skilled in the art that
each pair of primers comprises one primer which is
complementary to the original viral RNA and the other which
is complementary to the first strand of cDNA generated by
reverse transcription of the viral RNA. For example, in a
pair of genotype-specific primers for genotype 4b, one
primer would have a nucleotide sequence derived from region
168-207 of SEQ ID NO:41 and the other primer would have a
nucleotide sequence which is the complement of region
432-480 of SEQ ID NO:41. One skilled in the art would readily
recognize that such genotype-specific domains would also be
useful in designing oligonucleotides for use as
genotype-specific hybridization probes. Indeed, the sequences of
such genotype-specific hybridization probes are disclosed
later in the specification.
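
The genotype 4b example above, in which one primer is taken from one domain and the other is the complement of a second domain, can be sketched in code. This is a minimal illustration, assuming the sense-strand cDNA sequence is available as a string and that the complement is written 5' to 3' (i.e., as a reverse complement); the function names and the default primer length of 20 are hypothetical choices, and no actual SEQ ID sequence is reproduced.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Complement of a DNA sequence, written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def region(seq, start, end):
    """Return nucleotides start..end (1-based, inclusive), as numbered in the text."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError("region %d-%d is outside the sequence" % (start, end))
    return seq[start - 1:end]


def primer_pair(sense_cdna, first_domain, second_domain, length=20):
    """Derive one primer from each of two genotype-specific domains.

    The first primer is read directly from `first_domain`; the second is
    the reverse complement of the 3' end of `second_domain`.
    """
    first = region(sense_cdna, *first_domain)
    second = region(sense_cdna, *second_domain)
    if len(first) < length or len(second) < length:
        raise ValueError("domain shorter than requested primer length")
    return first[:length], reverse_complement(second[-length:])
```

With a genotype 4b sequence in hand, the call would be `primer_pair(seq, (168, 207), (432, 480))`.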

The amplification products of PCR can be detected
either directly or indirectly. In one embodiment, direct
detection of the amplification products is carried out via
labelling of primer pairs. Labels suitable for labelling
the primers of the present invention are known to one
skilled in the art and include radioactive labels, biotin,
avidin, enzymes and fluorescent molecules. The desired
labels can be incorporated into the primers prior to
performing the amplification reaction. A preferred
labelling procedure utilizes radiolabeled ATP and T4
polynucleotide kinase (Sambrook, J. et al. (1989) in
"Molecular Cloning, A Laboratory Manual", Cold Spring
Harbor Press, Plainview, NY). Alternatively, the desired
label can be incorporated into the primer extension
products during the amplification reaction in the form of
one or more labelled dNTPs. In the present invention, the
labelled amplified PCR products can be detected by agarose
gel electrophoresis followed by ethidium bromide staining
and visualization under ultraviolet light or via direct
sequencing of the PCR products.

In yet another embodiment, unlabelled
amplification products can be detected via hybridization
with nucleic acid probes that are radioactively labelled
or labelled with biotin, in methods known to one skilled
in the art such as dot and slot blot hybridization
(Kafatos, F.C. et al. (1979)) or filter hybridization
(Hollander, M.C. et al. (1990)).

In one embodiment, the nucleic acid sequences
used as probes are selected from, and substantially
homologous to, SEQ ID NOs:1-51. Such probes are useful as
universal probes in that they can detect PCR
amplification products of E1 cDNAs of an HCV isolate
belonging to any of the twelve HCV genotypes disclosed
herein. The size of these probes can range from about 200
to about 500 nucleotides.

In an alternative embodiment, the present
invention relates to a method for determining the genotype
of a hepatitis C virus present in a mammal where said
method comprises:

(a) amplifying RNA of a mammal via RT-PCR to
produce amplification products;

(b) contacting said products with at least one
genotype-specific oligonucleotide; and

(c) detecting complexes of said products which
bind to said oligonucleotide(s).

In this method, one embodiment of said
amplification step is carried out using the universal
primers (SEQ ID NO:103 through SEQ ID NO:108) as disclosed
above. In step (b) of this method, the nucleic acid
sequences used as probes are substantially homologous to
the sequences shown in SEQ ID NOs:109-135. The probes
disclosed in SEQ ID NOs:109-135 are useful in specifically
detecting PCR amplification products of E1 cDNAs of HCV
isolates belonging to one of the twelve HCV genotypes
disclosed herein. In a preferred embodiment, probes having
sequences substantially homologous to the sequences shown
in SEQ ID NOs:109-135 are used alone or in combination with
other probes specific to the same genotype.

For example, a probe having a sequence according
to SEQ ID NO:109 can be used alone or in combination with a
probe having a sequence according to SEQ ID NO:110. The
probes derived from SEQ ID NOs:109-135 can range in size
from about 30 to about 70 nucleotides and can be
synthesized as described earlier.

The nucleic acid sequence used as a probe to
detect PCR amplification products of the present invention
can be labeled in single-stranded or double-stranded form.
Labelling of the nucleic acid sequence can be carried out
by techniques known to one skilled in the art. Such
labelling techniques can include radiolabels and enzymes
(Sambrook, J. et al. (1989) in "Molecular Cloning, A
Laboratory Manual", Cold Spring Harbor Press, Plainview,
New York). In addition, there are known non-radioactive
techniques for signal amplification including methods for
attaching chemical moieties to pyrimidine and purine rings
(Dale, R.N.K. et al. (1973) Proc. Natl. Acad. Sci.
70:2238-2242; Heck, R.F. (1968) J. Am. Chem. Soc.
90:5518-5523), methods which allow detection by chemiluminescence
(Barton, S.K. et al. (1992) J. Am. Chem. Soc.
114:8736-8740), methods utilizing biotinylated nucleic acid
probes (Johnson, T.K. et al. (1983) Anal. Biochem.
133:126-131; Erickson, P.F. et al. (1982) J. of Immunology
Methods 51:241-249; Matthaei, F.S. et al. (1986) Anal.
Biochem. 157:123-128) and methods which allow detection by
fluorescence using commercially available products.

The present invention also relates to computer
analysis of the amino acid sequences shown in SEQ ID
NOs:52-102 by the program GENALIGN. This analysis groups
the 51 amino acid sequences shown in SEQ ID NOs:52-102 into
the twelve genotypes disclosed earlier in this application
based upon the degree of variation of the amino acid
sequences. For the purposes of the present invention, the
amino acid sequence identity of E1 amino acid sequences of
the same genotype ranges from about 85% to about 100%
whereas the identity of E1 sequences of different genotypes
ranges from about 45% to about 80%.
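
As a rough illustration of how these identity ranges could be applied, the sketch below computes percent identity for two pre-aligned sequences of equal length and compares it with the approximate bounds quoted above. The function names, the 85 and 80 cut-offs used as hard thresholds, and the toy peptides are assumptions made for the example; this is not the GENALIGN procedure.

```python
def percent_identity(a, b):
    """Percent of aligned positions at which two equal-length sequences agree."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be aligned to the same non-zero length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def same_genotype(identity, same_min=85.0, different_max=80.0):
    """True / False per the quoted ranges, or None if identity falls between them."""
    if identity >= same_min:
        return True
    if identity <= different_max:
        return False
    return None


# Toy ten-residue peptides, not taken from the sequence listing.
identity = percent_identity("ACDEFGHIKL", "ACDEFGHIKV")
print(identity, same_genotype(identity))
```

Returning None for values between the two ranges keeps the sketch from forcing a call the text does not support.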

The grouping of SEQ ID NOs:52-102 into the twelve
HCV genotypes is shown below:

SEQ ID NOs:    Genotype
52-59          I/1a
60-76          II/1b
77-80          III/2a
81-84          IV/2b
85             2c
86-90          V/3a
91             4a
92             4b
93-94          4c
95             4d
96-101         5a
102            6a
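
The grouping above is easy to hold as a small lookup table. The sketch below is only a convenience for the reader: the ranges are copied from the grouping, while the names GENOTYPE_RANGES and genotype_of are hypothetical.

```python
# (first SEQ ID NO, last SEQ ID NO, genotype), copied from the grouping above.
GENOTYPE_RANGES = [
    (52, 59, "I/1a"), (60, 76, "II/1b"), (77, 80, "III/2a"), (81, 84, "IV/2b"),
    (85, 85, "2c"), (86, 90, "V/3a"), (91, 91, "4a"), (92, 92, "4b"),
    (93, 94, "4c"), (95, 95, "4d"), (96, 101, "5a"), (102, 102, "6a"),
]


def genotype_of(seq_id_no):
    """Return the genotype assigned to an amino acid SEQ ID NO (52-102)."""
    for first, last, genotype in GENOTYPE_RANGES:
        if first <= seq_id_no <= last:
            return genotype
    raise KeyError(f"SEQ ID NO:{seq_id_no} is not an E1 amino acid sequence")


# Sanity checks: twelve genotypes covering 51 sequences in total.
print(len(GENOTYPE_RANGES))
print(sum(last - first + 1 for first, last, _ in GENOTYPE_RANGES))
print(genotype_of(92))
```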

For those genotypes containing more than one E1
amino acid sequence, computer alignment of the constituent
sequences of each genotype was conducted using the computer
program GENALIGN in order to produce a consensus sequence
for each genotype. These alignments and their resultant
consensus sequences are shown in Figures 2A-G for the seven
genotypes (I/1a, II/1b, III/2a, IV/2b, V/3a, 4c and 5a)
which comprise more than one sequence. Further alignment
of the consensus sequences shown in Figures 2A-G with the
amino acid sequences of SEQ ID NO:85 (genotype 2c); SEQ ID
NO:91 (genotype 4a); SEQ ID NO:92 (genotype 4b); SEQ ID
NO:95 (genotype 4d) and SEQ ID NO:102 (genotype 6a) to
produce a consensus amino acid sequence for all twelve
genotypes is shown in Figure 2H. The multiple alignment of
E1 amino acid sequences shown in Figures 2A-H serves to
highlight regions of homology and non-homology between
amino acid sequences and hence, these alignments can
readily be used by one skilled in the art to derive
peptides useful in assays and vaccines for the diagnosis
and prevention of HCV infection. Examples of purified and
isolated peptides provided by the present invention are
shown as SEQ ID NOs:136-159. These peptides are derived
from two regions of the amino acid sequences shown in
Figures 2A-H, amino acids 48-80 and amino acids 138-160.
The peptides shown in SEQ ID NOs:136-159 are useful as
genotype-specific diagnostic reagents since they are
capable of detecting an immune response specific to HCV
isolates belonging to a single genotype. The genotype
specificity of the peptides shown in SEQ ID NOs:136-159 is
as follows: SEQ ID NOs:136 and 148 are specific for
genotype IV/2b; SEQ ID NOs:137 and 149 are specific for
genotype 2c; SEQ ID NOs:138 and 150 are specific for
genotype III/2a; SEQ ID NOs:139 and 151 are specific for
genotype V/3a; SEQ ID NOs:140 and 152 are specific for
genotype II/1b; SEQ ID NOs:141 and 153 are specific for
genotype I/1a; SEQ ID NOs:142 and 154 are specific for
genotype 4a; SEQ ID NOs:143 and 155 are specific for
genotype 4c; SEQ ID NOs:144 and 156 are specific for
genotype 4d; SEQ ID NOs:145 and 157 are specific for
genotype 4b; SEQ ID NOs:146 and 158 are specific for
genotype 5a and SEQ ID NOs:147 and 159 are specific for

genotype 6a. In SEQ ID NO:136, Xaa at position 22 is a
residue of Ala or Thr, Xaa at position 24 is a residue of
Val or Ile, Xaa at position 26 is a residue of Val or Met;
in SEQ ID NO:138, Xaa at position 5 is a Ser or Thr
residue, Xaa at position 11 is an Arg or Gln residue, Xaa
at position 12 is an Arg or Gln residue; in SEQ ID NO:139,
Xaa at position 3 is a Pro or Ser residue, Xaa at position
33 is a Leu or Met residue; in SEQ ID NO:140, Xaa at
position 5 is a Thr or Ala residue, Xaa at position 13 is a
Gly, Ala, Ser, Val or Thr residue, Xaa at position 14 is a
Ser, Thr or Asn residue, Xaa at position 15 is a Val or Ile
residue, Xaa at position 16 is a Pro or Ser residue, Xaa at
position 18 is a Thr or Lys residue, Xaa at position 19 is
a Thr or Ala residue, Xaa at position 22 is an Arg or His
residue, Xaa at position 32 is an Ala, Val or Thr residue;
in SEQ ID NO:141, Xaa at position 3 is an Ala or Pro
residue, Xaa at position 4 is a Val or Met residue, Xaa at
position 5 is a Thr or Ala residue, Xaa at position 17 is a
Thr or Ala residue, Xaa at position 18 is a Thr or Ala
residue, Xaa at position 23 is a His or Tyr residue; in SEQ
ID NO:143, Xaa at position 10 is a Val or Ala residue, Xaa
at position 11 is a Ser or Pro residue, Xaa at position 18
is an Asp or Glu residue, Xaa at position 20 is a Leu or Ile
residue; in SEQ ID NO:146, Xaa at position 3 is a Gln or
His residue, Xaa at position 12 is an Asn, Ser or Thr
residue, Xaa at position 13 is a Leu or Phe residue, Xaa at
position 23 is an Ala or Val residue; in SEQ ID NO:148, Xaa
at position 16 is a Val or Ala residue, Xaa at position 18
is a Glu or Gln residue; in SEQ ID NO:150, Xaa at position
2 is an Ala or Thr residue, Xaa at position 4 is a Met or
Leu residue, Xaa at position 9 is an Ala or Val residue,
Xaa at position 17 is an Ile or Leu residue, Xaa at
position 20 is an Ile or Val residue, Xaa at position 21 is
a Ser or Gly residue; in SEQ ID NO:151, Xaa at position 9
is a Val or Ile residue, Xaa at position 16 is a Leu or Val
residue, Xaa at position 20 is an Ile or Leu residue; in
SEQ ID NO:152, Xaa at position 2 is an Ala or Thr residue,
Xaa at position 6 is a Val or Leu residue, Xaa at position
12 is an Ile or Leu residue, Xaa at position 16 is a Val or
Ile residue, Xaa at position 17 is a Val, Leu or Met
residue, Xaa at position 19 is a Met or Val residue, Xaa at
position 21 is an Ala or Thr residue; in SEQ ID NO:153, Xaa
at position 2 is a Thr or Ala residue, Xaa at position 6 is
a Val, Ile or Met residue, Xaa at position 12 is an Ile or
Val residue, Xaa at position 16 is an Ile or Val residue; in
SEQ ID NO:155, Xaa at position 5 is a Leu or Val residue,
Xaa at position 21 is a Thr or Ala residue; in SEQ ID
NO:158, Xaa at position 1 is a Thr or Ala residue, Xaa at
position 5 is a Val or Leu residue, Xaa at position 9 is a
Leu, Met or Val residue, Xaa at position 23 is a Gly or Ala
residue.
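
Each Xaa list defines a family of concrete peptides rather than a single sequence. The sketch below, offered only as an illustration, counts the members of the family described for SEQ ID NO:140 and expands a template into concrete peptides. The dictionary is transcribed from the paragraph above; the helper names and the three-residue toy template are hypothetical, since the full peptide sequences appear only in the sequence listing.

```python
from itertools import product
from math import prod

# Allowed residues at each Xaa position of SEQ ID NO:140, from the text above.
XAA_SEQ_ID_140 = {
    5: ("Thr", "Ala"), 13: ("Gly", "Ala", "Ser", "Val", "Thr"),
    14: ("Ser", "Thr", "Asn"), 15: ("Val", "Ile"), 16: ("Pro", "Ser"),
    18: ("Thr", "Lys"), 19: ("Thr", "Ala"), 22: ("Arg", "His"),
    32: ("Ala", "Val", "Thr"),
}


def variant_count(xaa):
    """Number of distinct peptides allowed by a set of Xaa positions."""
    return prod(len(choices) for choices in xaa.values())


def expand(template, xaa):
    """List every concrete peptide for a template with 'Xaa' placeholders."""
    positions = sorted(xaa)
    if any(template[p - 1] != "Xaa" for p in positions):
        raise ValueError("template and Xaa positions disagree")
    peptides = []
    for choice in product(*(xaa[p] for p in positions)):
        residues = list(template)
        for position, residue in zip(positions, choice):
            residues[position - 1] = residue
        peptides.append("-".join(residues))
    return peptides


print(variant_count(XAA_SEQ_ID_140))
print(expand(["Gly", "Xaa", "Cys"], {2: ("Val", "Ile")}))  # toy template
```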

Those skilled in the art would be aware that the
peptides of the present invention or analogs thereof can be
synthesized by automated instruments sold by a variety of
manufacturers or can be commercially custom-ordered and
prepared. The term analog has been described earlier in
the specification and for purposes of describing the
peptides of the present invention, analogs can further
include branched or non-linear arrangements of the peptide
sequences shown in SEQ ID NOs:136-159.

Alternatively, peptides can be expressed from
nucleic acid sequences where such sequences can be DNA,
cDNA, RNA or any variant thereof which is capable of
directing protein synthesis. In one embodiment,
restriction digest fragments containing a coding sequence
for a peptide can be inserted into a suitable expression
vector that functions in prokaryotic or eukaryotic cells.
Such restriction digest fragments may be obtained from
clones isolated from prokaryotic or eukaryotic sources
which encode the peptide sequence.

Suitable expression vectors and methods of
isolating clones encoding the peptide sequences of the
present invention have previously been described.

The preferred size of the peptides of the present
invention is from about 8 to about 100 amino acids in
length.

The present invention further relates to the use
of the peptides shown in SEQ ID NOs:136-159 in methods of
detecting antibodies specific for HCV in biological
samples. In one embodiment, at least one peptide specific
for a single genotype is used in previously described
immunoassays to detect antibodies specific for a single
genotype of HCV. A preferred immunoassay is ELISA.

It is understood by one skilled in the art that
the diagnostic assays described herein using
genotype-specific oligonucleotides or genotype-specific peptides can
be useful in assisting one skilled in the art to choose a
course of therapy for the HCV-infected individual.

In an alternative embodiment, a mixture of
peptides can be used in an immunoassay to detect antibodies
to any of the twelve genotypes of HCV. The mixture of
peptides, as disclosed herein, comprises at least one
peptide selected from SEQ ID NOs:140-141 and 152-153; one
peptide selected from SEQ ID NOs:136, 138, 148 and 150; one
peptide selected from SEQ ID NOs:142-145 and 154-157; one
peptide selected from SEQ ID NOs:146 and 158; one peptide
selected from SEQ ID NOs:139 and 151; one peptide selected
from SEQ ID NOs:138 and 150 and one peptide selected from
SEQ ID NOs:140 and 159. In a preferred embodiment, the
peptides of the present invention can be used in an ELISA
assay as described previously for E1 proteins.

The peptides or analogs thereof may be prepared
in the form of a kit, alone or in combination with other
reagents such as secondary antibodies, for use in
immunoassays. In addition, since the genotype-specific peptides
shown in SEQ ID NOs:136-159 are derived from two variable
regions in the E1 protein, amino acids 48-80 (SEQ ID
NOs:136-147) and amino acids 138-160 (SEQ ID NOs:148-159),
one skilled in the art would recognize that these peptides
would be useful as vaccines against hepatitis C. In the
present invention, a peptide from SEQ ID NOs:136-159 can be
used alone or in combination with other peptides shown
therein as immunogens in the vaccine. Formulations
suitable for administering the peptide(s) of the present
invention, routes of administration, pharmaceutical
compositions comprising the peptides and so forth are the
same as those previously described for recombinant E1
proteins. In addition, as described for E1 proteins, the
peptide(s) can also be used to prepare antibodies to HCV E1
protein.

The peptides of the present invention may also be
supplied in the form of a kit, alone, or in the form of a
pharmaceutical composition as described above for
recombinant E1 proteins.

Any articles or patents referenced herein are
incorporated by reference. The following examples
illustrate various aspects of the invention but are in no
way intended to limit the scope thereof.

MATERIALS

Serum used in these examples was obtained from
84 anti-HCV positive individuals that were previously found
to be positive for HCV RNA in a cDNA PCR assay with primer
set a from the 5' NC region of the HCV genome (Bukh, J. et
al. (1992(b)) Proc. Natl. Acad. Sci. USA 89:4942-4946). These
samples were from 12 countries: Denmark (DK); Dominican
Republic (DR); Germany (D); Hong Kong (HK); India (IND);
Sardinia, Italy (S); Peru (P); South Africa (SA); Sweden
(SW); Taiwan (T); United States (US); and Zaire (Z).


Example 1 

Identification of the DNA Sequence
of the E1 Gene of 51 Isolates of HCV via
RT-PCR Analysis of Viral RNA Using Universal Primers

Viral RNA was extracted from 100 µl of serum by
the guanidinium-phenol-chloroform method and the final RNA
solution was divided into 10 equal aliquots and stored at
-80°C as described (Bukh et al. (1992a)). The sequences
of the synthetic oligonucleotides used in the RT-PCR assay,
deduced from the sequence of HCV strain H-77 (Ogata, N. et
al. (1991) Proc. Natl. Acad. Sci. USA 88:3392-3396), are
shown as SEQ ID NOs:103-108. One aliquot of the final RNA
solution, equivalent to 10 µl of serum, was used for cDNA
synthesis that was performed in a 20 µl reaction mixture
using avian myeloblastosis virus reverse transcriptase
(Promega, Madison, WI) and SEQ ID NO:104 as a primer. The
resulting cDNA was amplified in a "nested" PCR assay by Taq
DNA polymerase (Amplitaq, Perkin-Elmer/Cetus) as described
previously (Bukh et al. (1992a)) with primer set e (SEQ ID
NOs:103-106). Precautions were taken to avoid
contamination with exogenous HCV nucleic acid (Bukh et al.
(1992a)), and negative controls (normal, uninfected serum)
were interspersed between every test sample in both the RNA
extraction and cDNA PCR procedures. No false positive
results were observed in the analysis. In most instances,
amplified DNA (first or second PCR products) was
reamplified with primers SEQ ID NO:107 and SEQ ID NO:108
prior to sequencing since these two primers contained EcoRI
sites which would facilitate future cloning of the E1 gene.
Amplified DNA was purified by gel electrophoresis followed
by glass-milk extraction (Geneclean, BIO 101, La Jolla, CA)
and both strands were sequenced directly by the
dideoxynucleotide chain termination method (Bachman, B. et al.
(1990) Nucl. Acids Res. 18:1309) with phage T7 DNA
polymerase (Sequenase, United States Biochemicals,
Cleveland, OH), [alpha-35S]dATP (Amersham, Arlington
Heights, IL) or [alpha ^P]dATP (Amersham or DuPont,
Wilmington, DE) and sequencing primers. RNA extracted from
serum containing HCV strain H-77, previously sequenced by
Ogata, N. et al. (1991), was amplified with primer set e
(SEQ ID NOs:103-106) and sequenced in parallel as a
control. The nucleotide sequences of the envelope 1 (E1)
gene of all 51 HCV isolates are shown as SEQ ID NOs:1-51.
In all 51 HCV isolates, the E1 gene was exactly 576
nucleotides in length and did not have any in-frame stop
codons.
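
The two properties reported for every isolate, a length of exactly 576 nucleotides and the absence of in-frame stop codons, can be checked mechanically. The following is a minimal sketch with hypothetical function names; it reads codons from the first base and uses the standard stop codons TAA, TAG and TGA. The short inputs are used only to exercise the check.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def in_frame_codons(seq):
    """Split a nucleotide string into codons, reading from the first base."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def is_open_reading_frame(seq, expected_length):
    """True if seq has the expected length and no in-frame stop codon."""
    seq = "".join(seq.split()).upper()  # tolerate spaces between codons
    if len(seq) != expected_length:
        return False
    return not any(codon in STOP_CODONS for codon in in_frame_codons(seq))


E1_LENGTH = 576  # nucleotides, as stated in the text
print(E1_LENGTH // 3)  # number of codons
# First six codons printed for SEQ ID NO:2, checked as an 18-nucleotide fragment.
print(is_open_reading_frame("TAC CAA GTA CGC AAC TCC", 18))
print(is_open_reading_frame("TAC TAA GTA", 9))
```

A full check would pass one of the 576-nucleotide sequences from the listing together with E1_LENGTH.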



Example 2 

Computer Analysis of the Nucleotide
and Deduced Amino Acid Sequences
of the E1 Gene of the 51 HCV Isolates

Multiple computer-generated alignments of the
nucleotide (SEQ ID NOs:1-51, Figures 1A-H) and deduced
amino acid sequences (SEQ ID NOs:52-102, Figures 2A-H) of
the cDNAs of the 51 HCV isolates constructed using the
computer program GENALIGN (Miller, R.H. et al. (1990) Proc.
Natl. Acad. Sci. USA 87:2057-2061) resulted in the 51 HCV
isolates being divided into twelve genotypes based upon the
degree of variation of the E1 gene sequence as shown in
Table 1.

The nucleotide and amino acid sequence identity
of HCV isolates of the same genotype was in the range of
88.0-99.1% and 89.1-98.4%, respectively, whereas that of
HCV isolates of different genotypes was in the range of
53.5-78.6% and 49.0-82.8%, respectively. The latter
differences are similar to those found when comparing the
envelope gene sequences of the various serotypes of the
related flaviviruses, as well as other RNA viruses. When
microheterogeneity in a sequence was observed, defined as
more than one prominent nucleotide at a specific position,
the nucleotide that was identical to that of the HCV
prototype (HCV1, Choo et al. (1989)) was reported if
possible. Alternatively, the nucleotide that was identical
to the most closely related isolate is shown.

Analysis of the consensus sequence of the E1
protein of the 51 HCV isolates from this study demonstrated
that a total of 60 (30.3%) of the 192 amino acids of the E1
protein were invariant among these isolates (Fig. 3). Most
impressive, all 8 cysteine residues as well as 6 of 8
proline residues were invariant. The most abundant amino
acids (e.g. alanine, valine and leucine) showed a very low
degree of conservation. The consensus sequence of the E1
protein contained 5 potential N-linked glycosylation sites.
Three sites at positions 209, 305 and 325 were maintained
in all 51 HCV isolates. A site at position 196 was
maintained in all isolates except the sole isolate of
genotype 2c. Also, a site at position 234 was maintained
in all isolates except one isolate of genotype I/1a, all
four isolates of genotype IV/2b and the sole isolate of
genotype 6a. Conversely, only genotype IV/2b isolates had
a potential glycosylation site at position 233. Further
analysis revealed a highly conserved amino acid domain (aa
302-328) in the E1 protein with 20 (74.1%) of 27 amino
acids invariant among all 51 HCV isolates. It is possible
that the 5' and 3' ends of this domain are conserved due to
important cysteine residues and N-linked glycosylation
sites. The central sequence, 5'-GHRMAWDMM-3' (aa 315-323),
may be conserved due to additional functional constraints
on the protein structure. Finally, although the amino acid
sequence surrounding the putative E1 protein cleavage site
was variable, an amino acid doublet (GV) at position 380
was invariant among all HCV isolates.
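
Potential N-linked glycosylation sites are conventionally recognized by the sequon Asn-X-Ser/Thr, where X is any residue except proline. The sketch below scans a protein string for that pattern. Two points are assumptions rather than statements from the text: the fragment is a hand translation of the first 26 codons printed for SEQ ID NO:2, and the numbering treats the first E1 residue as position 192, which makes the sites in the fragment fall at two of the positions named above.

```python
import re

# Asn, then any residue except Pro, then Ser or Thr. The lookahead lets
# overlapping sequons be reported individually.
SEQUON = re.compile(r"N(?=[^P][ST])")


def sequon_positions(protein, first_position=1):
    """Positions of the Asn residue in every N-X-S/T sequon (X not Pro)."""
    return [match.start() + first_position for match in SEQUON.finditer(protein)]


# Hand translation of the first 26 codons printed for SEQ ID NO:2 (assumption).
fragment = "YQVRNSSGLYHVTNDCPNSSIVYEAA"
print(sequon_positions(fragment, first_position=192))
```

The Asn at the fourteenth residue of the fragment is followed by Asp-Cys and is therefore not reported.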

A dendrogram of the genetic relatedness of the E1
protein of selected HCV isolates representing the 12
genotypes is shown in Fig. 4. This dendrogram was
constructed using the program CLUSTAL (Weiner, A.J. et al.
(1991) Virology 180:842-848) and had a limit of 25
sequences. The scale showing percent identity was added
based upon manual calculation. From the 51 HCV isolates
for which the complete sequence of the E1 gene region was
obtained, 25 isolates representing the twelve genotypes
were selected for analysis as follows. Among isolates with
genotype I/1a (SEQ ID NOs:52-59), as well as among isolates
with genotype II/1b (SEQ ID NOs:60-76), the two isolates
with the lowest amino acid identity within each genotype
were included. Among isolates of genotype IV/2b, isolate
DK8 (SEQ ID NO:81) that has an amino acid identity of 96.4%
to isolate T8 (SEQ ID NO:84) was excluded. Among isolates
of genotype V/3a, isolates S2 (SEQ ID NO:88) and S54 (SEQ
ID NO:90) that both shared 97.9% of the amino acids of
isolates HK10 (SEQ ID NO:87) and S52 (SEQ ID NO:89) were
excluded. Finally, among isolates of genotype VI, isolates
SA4 (SEQ ID NO:97) and SA5 (SEQ ID NO:98) with an amino
acid identity to isolate SA7 (SEQ ID NO:100) of 96.4% and
95.8%, respectively, were excluded. This dendrogram in
combination with the analysis of the E1 gene sequence of 51
HCV isolates in Table 1 demonstrates extensive
heterogeneity of this important gene.

The worldwide distribution of the 12 genotypes
among 74 HCV isolates is depicted in Fig. 5. The complete
E1 gene sequence was determined in 51 of these HCV isolates
(SEQ ID NOs:1-51), including 8 isolates of genotype I/1a,
17 isolates of genotype II/1b and 26 isolates comprising
genotypes III/2a, IV/2b, 2c, V/3a, 4a-4d, 5a and 6a. In the
remaining 23 isolates, all of genotypes I/1a and II/1b, the
genotype assignment was based on a partial E1 gene sequence
since they did not represent additional genotypes in any of
the 12 countries. The number of isolates of a particular
genotype is given in each of the 12 countries studied. Of
the twelve genotypes, genotypes I/1a and II/1b were the
most common, accounting for 48 (65%) of the 74 isolates.
Analysis of the E1 gene sequences available in the GenBank
data base at the time of this study revealed that all 44
such sequences were of genotypes I/1a, II/1b, III/2a and
IV/2b. Thus, based upon E1 gene analysis, 8 new genotypes
of HCV have been identified.

Also of interest, different HCV genotypes were
frequently found in the same country, with the highest
number of genotypes (five) being detected in Denmark. Of
the twelve genotypes, genotypes I/1a, II/1b, III/2a, IV/2b
and V/3a were widely distributed with genotype II/1b being
identified in 11 of 12 countries studied (Zaire was the
only exception). In addition, while genotypes I/1a and
II/1b were predominant in the Americas, Europe and Asia,
several new genotypes were predominant in Africa.

It was also found that genotypes I/1a, II/1b,
III/2a, IV/2b and V/3a of HCV were widely distributed
around the world, whereas genotypes 2c, 4a, 4b, 4d, 5a and
6a were identified only in discrete geographical regions.
For example, the majority of isolates in South Africa
comprised a new genotype (5a) and all isolates in Zaire
comprised 3 new closely related genotypes (4a, 4b, 4c).
These genotypes were not identified outside Africa.

Example 3 

Detection by ELISA Based on Antigen from
Insect Cells Expressing Complete E1 Protein

Expression of E1 protein in SF9 cells. A cDNA
(SEQ ID NO:1) encoding the complete E1 protein of SEQ ID
NO:52 is subcloned into pBlueBac transfer vector
(Invitrogen) using standard subcloning procedures. The
resultant recombinant expression vector is cotransfected
into SF9 insect cells (Invitrogen) by the Ca precipitation
method according to the Invitrogen protocol.

ELISA Based on Infected SF9 cells. 5 x 10^6 SF9
cells infected with the above-described recombinant
expression vector are resuspended in 1 ml of 10 mM
Tris-HCl, pH 7.5, 0.15 M NaCl and are then frozen and thawed 3
times. 10 µl of this suspension is dissolved in 10 ml of
carbonate buffer (pH 9.6) and used to cover one flexible
microtiter assay plate (Falcon). Serum samples are diluted
1:20, 1:400 and 1:8000, or 1:100, 1:1000 and 1:10000.
Blocking and washing solutions for use in the ELISA assay
are PBS containing 10% fetal calf serum and 0.5% gelatin
(blocking solution) and PBS with 0.05% Tween-20 (Sigma,
St. Louis, MO) (washing solution). As a secondary antibody,
peroxidase-conjugated goat IgG fraction to human IgG or
horseradish peroxidase-labelled goat anti-Old or anti-New
World monkey immunoglobulin is used. The results are
determined by measuring the optical density (O.D.) at 405
nm.
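
The quantities in this protocol follow simple arithmetic, sketched below for convenience. The two serum series are three-point series with 20-fold and 10-fold steps, and the 10 µl aliquot of a 1 ml suspension of 5 x 10^6 cells corresponds to the cell number computed at the end. The function names are hypothetical, and the calculation assumes a uniform suspension.

```python
def dilution_series(first, step, count):
    """Reciprocal dilutions: first, first*step, first*step**2, ..."""
    return [first * step ** i for i in range(count)]


def cell_equivalents(cells_in_stock, stock_volume_ul, aliquot_ul):
    """Number of cells represented by an aliquot of a uniform suspension."""
    return cells_in_stock * aliquot_ul // stock_volume_ul


print(dilution_series(20, 20, 3))    # the 1:20, 1:400, 1:8000 series
print(dilution_series(100, 10, 3))   # the 1:100, 1:1000, 1:10000 series
print(cell_equivalents(5 * 10**6, 1000, 10))  # cells coated per plate
```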



To determine if insect cell-derived E1 protein
representing genotype I/1a of HCV could detect anti-HCV
antibody in chimpanzees infected with genotype I/1a of HCV,
three infected chimpanzees are examined. The sera of all
3 chimpanzees are found to seroconvert to anti-HCV.






Example 4 

Use of the Complete
E1 Protein as a Vaccine

Mammals are immunized with purified or partially
purified E1 protein in an amount sufficient to stimulate
the production of protective antibodies. The immunized
mammals challenged with various genotypes of HCV are
protected.

It is understood by one skilled in the art that
the recombinant E1 protein used in the above vaccine can
also be used in combination with other recombinant E1
proteins having an amino acid sequence shown in SEQ ID
NOs:52-102.






Example 5 

Determination of the Genotype of an HCV 
Isolate Via Hybridization of Genotype-Specific 
Oligonucleotides to RT-PCR Amplification Products. 

Viral RNA is isolated from serum obtained from a 
mammal and is subjected to RT-PCR as in Example 1. 
Following amplification, the amplified DNA is purified as 
described in Example 1 and aliquots of 100 mg of 
amplification product are applied to twelve dots on a 
nitrocellulose filter set in a dot blot apparatus. The 
twelve dots are then cut into separate dots and each dot is 
hybridized to a ^-labelled oligonucleotide specific for a 
single genotype of HCV. The oligonucleotides to be used as 
hybridization probes are selected from SEQ ID NOs:109-135. 
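
Reading the twelve dots amounts to asking which single genotype-specific probe gave a signal. The sketch below illustrates that decision rule only; the function name, the numeric signal values and the threshold are hypothetical and are not part of the example above.

```python
def call_genotype(signals, threshold):
    """Return the one genotype whose dot is positive, or None if ambiguous.

    signals maps a genotype label to the signal measured on the dot that was
    hybridized with that genotype's oligonucleotide.
    """
    positive = sorted(g for g, value in signals.items() if value >= threshold)
    if len(positive) == 1:
        return positive[0]
    return None  # no dot, or more than one dot, gave a signal


# Hypothetical readings for four of the twelve dots.
dots = {"I/1a": 0.1, "II/1b": 0.2, "4b": 3.4, "5a": 0.1}
print(call_genotype(dots, threshold=1.0))
```

Returning None when zero or several dots are positive avoids reporting a genotype the blot does not clearly support.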

Example 6 

ELISA Based on Synthetic
Peptides Derived From E1 cDNA Sequences

Synthetic peptides specific for genotype I/1a and
having amino acid sequences according to SEQ ID NOs:136-148
are placed in 0.1% PBS buffer and 50 µl of 1 mg/ml of peptide
is used to cover each well of the microtiter assay plate.
Serum samples from two mammals infected with genotype I/1a
HCV and from one mammal infected with genotype 5a HCV are
diluted as in Example 3 and the ELISA is carried out as in
Example 3. Both mammals infected with genotype I/1a HCV react
positively with the peptides while the mammal infected with
genotype 5a HCV exhibits no reactivity.



Example 7
Use of the E1 Peptides as a Vaccine

Since the E1 genotype-specific peptides of the
present invention are derived from two variable regions in
the complete E1 protein, there exists support for the use
of these peptides as a vaccine to protect against a variety
of HCV genotypes. Mammals are immunized with peptide(s)
selected from SEQ ID NOs:136-159 in an amount sufficient
to stimulate production of protective antibodies. The
immunized mammals challenged with various genotypes of HCV
are protected.

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: DK7

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

TAC AAV CAA GTG CGC AAC TCC GGG CTT TAC CAT GTC ACC 39

GAT TGC CCT AAC TCG Atari ATC GTG TAC GAG GCG GCC 78

GAT GCC ATC CTG CAC ACT CCG GGG TGT GTC CCT TGC GTT 117

CGC GAG GGT AAC GTC TCG AGG TGT TGG GTG GCG ATG ACC 156

CCC ACG GTG GCC ACC AGG GAT GGC AAA CTC CCC ACA GCG 195

CAG CTT CGA CGT CAC ATC GAT CTG CTC GTC GGG AGT GCC 234

ACC CTC TGT TCG GCC CTC TAC GTG GGG GAC CTG TGC GGG 273

TCT GTC TTT CTT GTC GGT CAA CTG TTT ACC TTC TCT CCC 312

AGG CGC CAC TGG ACG ACG CAA GGC TGC AAT TGT TCT ATC 351

TAT CCT GGC CAT ATA ACG GGT CAC CGC ATG GCG TGG GAT 390

ATG ATG ATG AAC TGG TCC CCT ACC ACG GCG TTG GTA GTA 429

GCT CAG CTG CTC CGG ATC CCG CAA GCC ATC TTG GAC ATG 468

ATC GCT GGT GCT CAC TGG GGA GTC CTG GCG GGC ATA GCG 507

TAT TTT TCC ATG GTG GGG AAC TGG GCG AAG GTC CTG GTA 546

GTG CTG CTG CTA TTT GCC GGC GTC GAC GCG 576



(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: DK9

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

TAC CAA GTA CGC AAC TCC TCG GGC CTC TAC CAT GTC ACC 39

AAT GAT TGC CCT AAC TCG AGT ATT GTG TAC GAG GCG GCC 78 

GAT GCC ATC CTG CAT TCT CCA GGG TGT GTC CCT TGC GTT 117 

CGC GAG GGT AAC GCC TCG AAA TGT TGG GTG GCG GTG GCC 156 

CCC ACG GTG GCC ACC AGG GAC GGC AAG CTC CCC GCA ACG 195 

CAG CTT CGA CGT CAC ATC GAT CTG CTT GTC GGG AGC GCC 234 

ACC CTC TGC TCG GCC CTC TAT GTG GGG GAC TTG TGC GGG 273 

TCT GTC TTC CTT GTC GGC CAA CTG TTC ACC TTC TCC CCC 312 

AGA CGC CAC TGG ACA ACG CAA GAC TGC AAC TGT TCT ATC 351

TAC CCC GGC CAT ATT ACG GGT CAT CGC ATG GCG TGG GAT 390

ATG ATG ATG AAC TGG TCC CCT ACA GCA GCG CTG GTA ATG 429

GCG CAG CTG CTC AGG ATC CCG CAG GCC ATC TTG GAC ATG 468

ATC GCT GGT GCC CAC TGG GGA GTC CTA GCG GGC ATA GCG 507

TAT TTC TCC ATG GTG GGG AAC TGG GCG AAG GTC GTG GTG 546

GTA CTG TTG CTG TTT ACC GGC GTC GAT GCG 576

(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: DR1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 



CAC 


CAA 


GTG 


CGC 


AAC 


TCT 


ACA 


GGG 


CTT 


TAC 


CAT 


AAT 


GAT 


TGC 


CCT 


AAT 


TCG 


AGT 


ATT 


GTG 


TAC 


GAG 


GAT 


GCC 


ATC 


CTG 


CAC 


GCG


CCG 


GGG 


TGT 


GTC 


CCT 


CGC 


GAG 


GGT 


AAC 


GCC 


TCG 


AGG 


TGT 


TGG 


GTG 


GCG 


CCC 


ACG 


GTG 


GCC 


ACC 


AGG 


GAC 


GGC 


AAA 


CTC 


CCC 


CAG 


CTT 


CGA 


GGT 


CAC 


ATC 


GAC 


CTG 


CTT 


GTC 


GGG 


ACC 


CTC 


TGC 


TCG 


GCC 


CTC 


TAC 


GTG 


GGG 


GAC 


CTG 


TCT 


GTC 


TTC 


CTT 


GTC 


GGT 


CAA 


CTG 


TTC 


ACC 


TTT 


AGG 


CGC 


CAC 


TGG 


ACA 


ACG 


CAA 


GAC 


TGC 


AAT 


TGT 


TAT 


CCC 


GGC 


CAT 


ATA 


ACG 


GGA 


CAC 


CGT 


ATG 


GCA 


ATG 


ATG 


ATG 


AAC 


TGG 


TCC 


CCT 


ACG 


ACA 


GCG 


CTG 


GCT 


CAG 


CTG 


CTC 


CGG 


ATC 


CCA 


CAA 


GCC 


ATC 


TTG 


ATC 


GCT 


GGA 


GCC 


CAC 


TGG 


GGA 


GTC 


CTA 


GCG 


GGC 


TAT 


TTC 


TCC 


ATG 


GTG 


GGG 


AAC 


TGG 


GCG 


AAG 


GTC 


GTG 


CTG 


TTG 


CTG 


TTT 


GCC 


GGC 


GTT 


GAT 


GCG 



(2) INFORMATION FOR SEQ ID NO: 4:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: DR4 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:




CGC AAC TCT ACA GGG CTT TAC CAT GTC ACC 39
CCT AAT TCG AGT ATT GTG TAC GAG GCG GCC 78
CTG CAC ACG CCG GGG TGT GTC CCT TGC GTT 117
AAC ACC TCG AGG TGT TGG GTG GCG GTG ACC 156
GCC ACC AGG GAC GGC AAA CTC CCC ACA ACG 195
CGT CAC ATC GAC CTG CTT GTC GGG AGC GCC 234
TCG GCC CTC TAC GTG GGG GAC TTG TGC GGG 273
CTT GTC GGT CAA CTG TTC ACC TTC TCT CCC 312
TGG ACA ACG CAA GAC TGC AAT TGT TCC ATC 351
TAT CCC GGC CAT ATA ACG GGC CAC CGC ATG GCG TGG GAT 390
ATG ATG ATG AAC TGG TCC CCT ACG ACA GCG CTG GTA GTA 429
GCT CAG CTG CTC CGG ATC CCA CAA GCC ATC TTG GAC ATG 468
ATC GCT GGT GCC CAC TGG GGA GTC CTA GCG GGC ATA GCG 507
TAT TTC TCC ATG GTG GGG AAC TGG GGG AAG GTC CTG GTA 546
GTG CTG TTG CTG TTT GCC GGC GTT GAT GCG 576



(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: S14 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

TAC CAA GTG CGC AAC TCC ACG GGG CTT TAC CAT GTT ACC 39 

AAT GAT TGC CCT AAC TCG AGT ATT GTG TAC GAG ACA GCT 78 

GAT GCT ATC CTA CAC GCT CCG GGA TGT GTC CCT TGC GTT 117 

CGT GAG GGT AAC ACC TCG AGG TGT TGG GTG GCG ATG ACC 156 

CCC ACG GTG GCC ACC AGG GAC GGC AAA CTC CCC GCA ACG 195 

CAG CTT CGA CGT TAC ATC GAT CTG CTT GTC GGG AGC GCC 234 

ACC CTC TGT TCG GCC CTC TAC GTG GGG GAC TTG TGC GGG 273 

TCT GTC TTT CTT GTC GGT CAG CTG TTT ACC TTC TCT CCC 312 

AGG CGC CTC TGG ACG ACG CAA GAC TGC AAT TGT TCT ATC 351 

TAT CCC GGC CAT ATA ACG GGT CAT CGC ATG GCA TGG GAT 390 

ATG ATG ATG AAC TGG TCC CCT ACG ACG GCA CTG GTA GTA 429 

GCT CAG CTG CTC CGG ATC CCA CAA GCC ATC TTG GAT ATG 468 

ATC GCT GGT GCT CAC TGG GGA GTC CTA GCG GGC ATA GCG 507 

TAT TTC TCC ATG GTG GGA AAC TGG GCG AAG GTC CTA GTG 546 

GTG CTG CTG CTA TTC GCC GGC GTT GAC GCG 576 
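
Each row of the listing above carries 13 codons followed by the running base count at the right margin (39, 78, and so on up to 576). The following Python sketch is not part of the specification; it only illustrates one way a transcribed set of rows could be checked against those printed counts. The names `parse_row` and `check_listing` are hypothetical, and the two sample rows are copied from SEQ ID NO: 5.

```python
import re

CODON = re.compile(r"[ACGT]{3}")  # one DNA triplet, upper case only

def parse_row(line):
    """Split one listing row into (codons, printed running total)."""
    tokens = line.split()
    # The last token is the running base count printed at the right margin.
    return tokens[:-1], int(tokens[-1])

def check_listing(rows):
    """Return the rows that fail the check, in their original order."""
    seen = 0          # bases counted so far across all rows
    problems = []
    for line in rows:
        codons, declared = parse_row(line)
        seen += sum(len(c) for c in codons)
        clean = all(CODON.fullmatch(c) for c in codons)
        # A row fails if a token is not a clean triplet or the totals disagree.
        if not clean or seen != declared:
            problems.append(line)
    return problems

rows = [
    "TAC CAA GTG CGC AAC TCC ACG GGG CTT TAC CAT GTT ACC 39",
    "AAT GAT TGC CCT AAC TCG AGT ATT GTG TAC GAG ACA GCT 78",
]
print(check_listing(rows))  # an empty list means both rows are consistent
```

A row containing a character outside A, C, G and T (for example a letter O misread for C) would be reported by the same check, which makes it a convenient first pass over a scanned listing.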



(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: S18



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

TAC CAA GTA CGC AAC TCC ACG GGC CTT TAC CAT GTC ACC 39
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AAT 


GAC 


TGC 


CCT 


AAC 


TCG 




" o*± 


oILj 


1AL 




ALb 


* 


78 


GAT 

W*A JU 


ACC 


ATC 


PTA 




TPT 

X wX 


PPfl 


WW 


lvji 


bit 


LL1 




(all 


117 


CGC 


GAG 


GGT 


AAP 


RPP 


TPC 




rrv-irp 


l\its 


GTG 


/-i /-i /-i 

CCG 


GTG 


GCC 


156 


ppp 


APA 

AWl 




f^PP 


ALL 




laAL. 


GGC 


* 7V 7\ 

AAA 


CTC 


CCC 


ft ft ^ 
GCA 


ACG 


195 


PAG 


PTT 
wx a 


ptia 




par 1 


AIL 


GAT 


CTG 


CTT 


GTT 


ft ft ft 

GGG 


AGC 


GCC 


234 


APP 


PTP 


X V7l~ 




IjLL 


Lit 


TAT 


GTG 


GGG 


ft -yt ft 

GAC 


CTG 


wnft ft 

TGC 


GGG 


273 


TCT 


GTC 


mmrp 

X <A X 


PTT 


CTP 

V7X W 


Af2P 


CAG 


CTG 


1 1L 




A1C 


TCC 


CCC 


312 


AGG CGC CAC TGG ACA ACG CAA GAC TGC AAC TGT TCT ATC 351
TAC CCC GGC CAT ATA ACG GGT CAC CGT ATG GCA TGG GAT 390
ATG ATG ATG AAC TGG TCC CCT ACA ACG GCG TTG GTA ATA 429
GCT CAG CTG CTC AGG GTC CCG CAA GCC GTC TTG GAC ATG 468
ATC GCT GGT GCC CAC TGG GGA GTC CTA GCG GGC ATA GCG 507
TAT TTC TCC ATG GCG GGG AAC TGG GCG AAG GTC CTG CTA 546
GTG CTG TTG CTG TTT GCC GGC GTC GAT GCG 576

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: SW1 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 



TAC CAA GTA CGC AAC TCC TCG GGC CTT TAC CAT GTC ACC 39
AAT GAT TGC CCT AAC TCG AGT ATT GTG TAC GAG ACG GCC 78
GAT GCC ATT CTA CAC TCT CCA GGG TGT GTC CCT TGC GTT 117
CGC GAG GAT GGC GCC CCG AAG TGT TGG GTG GCG GTG GCC 156
CCC ACA GTC GCC ACT AGG GAC GGC AAA CTC CCT GCA ACG 195
CAG CTT CGA CGT CAC ATC GAT CTG CTT GTC GGA AGC GCC 234
ACC CTC TGC TGG GCC CTC TAC GTG GGG GAC TTG TGC GGG 273
TCT GTC TTT CTC GTC AGT CAA CTG TTC ACG TTC TCC CCC 312
AGG CGC CAC TGG ACA ACG CAA GAC TGT AAC TGT TCT ATC 351
TAT CCC GGC CAC ATA ACG GGT CAC CGC ATG GCA TGG GAT 390
ATG ATG ATG AAC TGG TCC CCC ACA ACA GCG CTG GTA GTA 429
GCT CAG CTG CTC AGG ATC CCG CAA GCC GTC TTG GAC ATG 468
ATC GCT GGT GCC CAC TGG GGA GTC CTA GCG GGC ATA GCG 507
TAT TTC TCC ATG GTG GGG AAC TGG GCG AAG GTC CTG ATA 546
GTG CTG TTG CTG TTT TCC GGC GTC GAT GCG 576

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



WO 95/01442 PCT/US94/07320 




(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: US11



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 



TAC CAA GTA CGC AAC TCC ACG GGG CTT TAC CAT GTC ACC 39
AAT GAT TGC CCT AAC TCG AGT ATT GTG TAC GAG GCG GCC 78
GAT GCC ATC CTG CAC ACT CCG GGG TGT GTT CCT TGC GTT 117
CGC GAG GGT AAC GCT TCG AGG TGT TGG GTG GCG ATG ACC 156
CCC ACG GTG GCC ACC AGG GAC GGC AAA CTC CCC ACA ACG 195
CAA CTT CGA CGT CAC ATC GAT CTG CTT GTC GGG AGC GCC 234
ACC CTC TGT TCG GCC CTC TAC GTG GGG GAC CTG TGC GGG 273
TCT GTC TTT CTT GTC GGT CAA CTG TTT ACC TTC TCT CCC 312
AGA CGC CAC TGG ACG ACG CAG GGC TGC AAT TGT TCT ATC 351
TAT CCC GGC CAT ATA ACG GGT CAC CGC ATG GCA TGG GAT 390
ATG ATG ATG AAC TGG TCC CCT ACG GCG GCG TTG GTG GTA 429
GCT CAG CTG CTC CGG ATC CCA CAA GCC ATC TTG GAC ATG 468
ATC GCT GGT GCT CAC TGG GGA GTC CTA GCG GGC ATA GCG 507
TAT TTC TCC ATG GTG GGG AAC TGG GCG AAG GTC CTG GTA 546
GTG CTG CTG CTA TTT GCC GGC GTC GAC GCG 576

(2) INFORMATION FOR SEQ ID NO: 9:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: D1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 



TAT GAA GTG CGC AAC GTG TCC GGG GTG TAC CAT GTC ACG 39 

AAC GAC TGT TCC AAC TCG AGC ATT GTG TAT GAG ACA GCG 78

GAC ATG ATC ATG CAC ACC CCC GGG TGC GTG CCC TGC GTT 117 

CGG GAG GAC AAC TCC TCT CGC TGC TGG GTA GCG CTC ACC 156 

CCC ACG CTC GCG GCT AGG AAT GGC AAC GTC CCC ACT ACG 195 

GCG ATA CGA CGC CAC GTC GAT TTG CTC GTT GGG GCG GCT 234 

GCT TTC TGC TCC GCC ATG TAC GTG GGG GAT CTC TGC GGA 273 

TCT GTT TTC CTC ATC TCC CAG CTG TTC ACC CTC TCG CCT 312 

CGC CGG CAT GAG ACG GTA CAG GAG TGT AAT TGC TCA ATC 351 

TAT CCC GGC CAC GTG ACA GGT CAC CGT ATG GCT TGG GAT 390

ATG ATG ATG AAC TGG TCA CCT ACA ACA GCC TTA GTG GTA 429 

TCG CAG TTA CTC CGG ATC CCA CAA GCT GTC ATG GAC ATG 468 

GTG GCG GGG GCC CAC TGG GGG GTC CTG GCG GGC CTC GCC 507 

TAC TAT TCC ATG GTG GGG AAC TGG GCT AAG GTT TTG ATT 546 

GTG ATG CTA CTC TTT GCT GGC GTT GAC GGC 576 
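
The rows above are printed as codons in a single reading frame, so an amino acid sequence can be deduced from them directly. As a minimal sketch only (not part of the specification), the standard genetic code can be applied to the first row of SEQ ID NO: 9. The helper name `translate` and the compact construction of the codon table are assumptions made for illustration.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in the order TTT, TTC, TTA, TTG, TCT, ... GGG ('*' = stop).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(row):
    """Translate space-separated DNA codons into one-letter amino acid codes."""
    return "".join(CODON_TABLE[codon] for codon in row.split())

# First row of SEQ ID NO: 9 (isolate D1), without the running count.
first_row = "TAT GAA GTG CGC AAC GTG TCC GGG GTG TAC CAT GTC ACG"
print(translate(first_row))  # YEVRNVSGVYHVT
```

A codon that contains a character outside A, C, G and T raises a `KeyError` here, which is a deliberate choice in this sketch: scanning errors surface immediately instead of being translated silently.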



(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: D3 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 



TAT GAA GTG CGC AAC GTG TCC GGG GTG TAC CAA GTC ACC 39 

AAT GAC TGT TCC AAC TCG AGC ATC GTG TAT GAG ACA GCG 78 

GAC ATG ATC ATG CAC ACC CCC GGG TGC GTG CCC TGC GTT 117 

CGG GAG GAC AAC TCC TCT CGC TGC TGG GTA GCG CTC ACC 156

CCC ACG CTC GCG GCT AGG AAT AGC AGC GTC CCC ACT ACG 195 

ACA ATA CGA CGC CAC GTC GAT TTG CTC GTT GGG GCG GCT 234 

GCT TTC TGC TCC GCC ATG TAC GTG GGG GAT CTT TGC GGA 273 

TCT GTT TTC CTC GTC TCC CAG CTG TTC ACC TTC TCG CCT 312 

CGC CGG CAT GAG ACA GTA CAG GAA TGT AAC TGC TCA ATC 351 

TAT CCC GGC CAC GTG ACA GGT CAC CGC ATG GCT TGG GAT 390 

ATG ATG ATG AAC TGG TCG CCT ACA GCA GCC CTA GTG GTA 429

TCG CAG TTA CTC CGG ATC CCA CAA GCT GTC GTG GAC ATG 468 

GTG GCG GGG GCC CAC TGG GGG GTC CTG GCG GGC CTC GCC 507 

TAC TAT TCC ATG GTG GGG AAC TGG GCT AAG GTT TTG ATT 546 

GTG ATG CTA CTC TTT GCT GGC GTC GAC GGC 576 



(2) INFORMATION FOR SEQ ID NO: 11: 


(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: DK1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 



TAT GAA GTG CGC AAC GTG TCC GGG GTG TAC CAC GTC ACA 39 

AAC GAC TGC TCC AAC TCA AGC ATC GTG TAT GAG GCA GTG 78 

GAC GTG ATC ATG CAT ACC CCA GGG TGC GTG CCC TGC GTT 117

CGG GAG AAC AAC CAC TCC CGT TGC TGG GTA GCG CTC ACC 156 

CCC ACG CTC GCG GCC AGG AAC GCC AGC ATC CCC ACT ACG 195 

ACA ATA CGA CGC CAT GTC GAT TTG CTC GTT GGG GCG GCT 234 

GCT TTC TGC TCC GCT ATG TAC GTG GGG GAC CTC TGC GGA 273 

TCC GTT TTC CTC GTC TCT CAG CTG TTC ACC TTT TCA CCT 312 

CGC CGG CAT GAG ACA GCA CAG GAC TGC AAC TGC TCA ATC 351 

TAT CCC GGC CAC GTT TCA GGT CAC CGC ATG GCT TGG GAT 390

ATG ATG ATG AAC TGG TCA CCT ACA ACA GCC CTA GTG CTA 429 

TCG CAG TTA CTC CGA ATC CCA CAA GCT GTC GTG GAC ATG 468

GTG GCG GGG GCC CAC TGG GGA GTC CTG GCG GGC CTC GCC 507 

TAC TAC TCC ATG GCG GGG AAC TGG GCC AAG GTT TTA ATT 546 

GTG TTG CTA CTC TTT GCC GGC GTT GAT GGG 576 

(2) INFORMATION FOR SEQ ID NO: 12: 


(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: HK3 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 



TAT GAA GTG CGC AAC GTG TCC GGG ATA TAC CAT GTC ACG 39
AAC GAC TGC TCC AAC TCA AGC GTC GTG TAT GAG ACA GCA 78
GAC ATG ATC ATG CAT ACC CCT GGA TGC GTG CCC TGC GTA 117
CGG GAG AAC AAC TCC TCC CGC TGT TGG GTA GCG CTC ACT 156
CCC ACG CTC GCG GCC AGG AAC GTC AGC GTC CCC ACC ACG 195
ACA ATA CGA CGT CAC GTC GAC TTG CTC GTT GGG GCG GCT 234
GCC TTC TGC TCC GCT ATG TAC GTG GGG GAT CTC TGC GGA 273
TCT GTT TTC CTT GTC TCC CAG CTG TTC ACC TTC TCG CCT 312
CGC CGA CAC GAG ACA GTA CAG GAC TGC AAC TGC TCA CTC 351
TAT CCC GGC CAC GTA TCA GGT CAC CGC ATG GCT TGG GAT 390
ATG ATG ATG AAC TGG TCC CCT ACA GGA GCC CTA GTG GTG 429
TCG CAA TTA CTC CGG ATC CCG CAA GCT GTC GTG GAC ATG 468
GTG GCG GGG GCC CAC TGG GGA GTC CTA GCG GGC CTT GCC 507
TAC TAT TCC ATG GTG GGA AAC TGG GCT AAG GTT TTG ATT 546
GTG ATG CTA CTT TTT GCC GGC GTT GAT GGG 576

(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: HK4 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

CAT GAA GTG CAC AAC GTA TCC GGG ATC TAC CAT GTC ACG 39 
AAC GAC TGC TCC AAC TCA AGT ATT GTG TAT GAG GCA GCG 78
GAC ATG ATC ATG CAT ACC CCC GGG TGC GTG CCC TGC GTC 117 








I3AV7 




7V 7V 


ICC 


rcc 


/ V 'III 

CGT 


TGC 


TGG 


GTA 


GCG 


CTC 


ACT 


156 


LLL 


ACLj 


/-irn/-i 
CiC 


GCG 


GCC 


AGG 


AAC 


GCC 


AGC


ATC 


CCC 


ACT 


ACG 


195 


ACA 


ATA


CGA 


CGC 


CAT 


GTC 


GAC 


TTG 


CTC 


GTT 


m 

GGG 


GCG 


GCT 


234 


GCT 


ire 


TGC 


TCC 


GCC 


ATG 


TAC 


GTG 


GGA 


GAT 


CTC 


TGC 


GGA 


273 


HP 


nnip 

V?1L. 




CTC 


GTC 


TCC 


CAG 


TTG 


TTC 


ACC 


TTC 


TCG 


CCT 


312 


CGC CGG CAT GAG ACG GTA CAG GAC TGC AAT TGC TCA ATC 351
TAT CCC GGC CAC GTA TCA GGT CAC CGC ATG GCT TGG GAT 390
ATG ATG ATG AAC TGG TCA CCT ACA GCA GCC CTA GTG GTA 429
TCG CAG TTA CTC CGA CTC CCA CAA GCT GTC ATG GAC ATG 468
GTG GCG GGA GCC CAC TGG GGA GTC CTA GCG GGC CTT GCT 507
TAC TAT TCC ATG GTG GGG AAC TGG GCC AAG GTT TTG ATT 546
GTG ATG CTA CTC TTT GCC GGC GTT GAC GGG 576

(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: HK5 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 



TAT GAA GTG CGC AAC GTG TCC GGG GTA TAC CAT GTC ACG 39 

AAC GAC TGC TCC AAC TTA AGC ATC GTG TAC GAG ACA ACG 78 

GAC ATG ATC ATG CAC ACC CCT GGG TGC GTG CCC TGC GTT 117

CGG GAA AAC AAC TCC TCC CGT TGT TGG GTA GCG CTC GCC 156 

CCC ACG CTC GCG GCC AGG AAC GCC AGC GTC CCC ACC ACG 195 

GCA ATA CGA CGC CAC GTC GAC TTG CTC GTT GGG GCG GCT 234 

GCT TTC TGC TCC GCT ATG TAC GTG GGG GAT CTT TGC GGA 273 

TCT GTT TTC CTC GTC TCC CAG CTG TTC ACC TTC TCG CCT 312 

CGC CGA CAC GAG ACG GTA CAG GAC TGC AAC TGC TCA ATC 351 

TAT CCC GGC CAC GTA ACA GGT CAC CGC ATG GCT TGG GAT 390

ATG ATG ATG AAC TGG TCA CCT ACA ACA GCC CTA GTG GTG 429 

TCG CAG TTA CTC CGG ATC CCG CAA GCT GTC GTG GAC ATG 468 

GTA GCG GGG GCC CAC TGG GGG GTC CTG GCG GGC CTT GCC 507 

TAC TAT TCC ATG GTG GGA AAC TGG GCT AAG GTT TTG ATT 546 

GTG ATG CTA CTT TTT GCC GGC GTT GAT GGG 576 



(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: HK8 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 






TAT 


GAA 


GTG 


CGC 


AAC 


GTG 


TCC 

» ^^^^ 


GGG 


ATA 


TAP 






ALU 


1 o 


AAC GAC TGC TCC AAC TCA AGC ATC GTG TAT GAA ACA GCG 78
GAC ATG ATT ATG CAT ACC CCT GGA TGC ATG CCC TGC GTT 117
CGG GAG AAC AAC TCC TCC CGT TGC TGG GTG GCG CTC ACT 156
CCC ACG CTC GCG GCT AGG AAT GTC AGC GTC CCC ACT ACG 195
ACA ATA CGA CGC CAC GTC GAC TTG CTC GTT GGG GCG GCT 234
GCT TTC TGC TCC GCT ATG TAC GTG GGG GAT CTC TGC GGA 273
TCT GTT TTC CTC GTC TCC CAG CTG TTC ACC TTT TCG CCT 312
CGC CGA CAC GAG ACG GTA CAG GAC TGC AAC TGC TCA ATC 351
TAT CCC GGC CAC GTA TCA GGT CAC CGC ATG GCT TGG GAT 390
ATG ATG ATG AAC TGG TCG CCC ACA ACA GCC CTA GTG GTG 429
TCG CAG TTA CTC CGG ATC CCG CAA GCT ATC GTG GAC ATG 468
GTG GCG GGG GCC CAC TGG GGA GTC CTA GCG GGC CTT GCC 507
TAC TAT TCC ATG GTG GGC AAC TGG GCT AAG GTT TTG ATT 546
GTG ATG CTA CTG TTT GCC GGC GTT GAT GGG 576

(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: IND5 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

TAT GAA GTG CGC AAC GTG TCC GGG GTG TAC CAT GTC ACG 39 

AAC GAC TGC TCC AAC TCA AGT ATT GTG TAT GAG GCA GCG 78

GAC ATG ATC ATG CAC ACT CCC GGG TGC GTG CCC TGC GTT 117 

CGG GAG GGC AAC TCC TCT CGC TGC TGG GTA GCG CTC ACT 156 

CCC ACT CTC GCG GCC AGG AAC GCC AGC GTC TCC ACC ACG 195 

ACA ATA CGA CAC CAC GTC GAT TTG CTC GTT GGG GCG GCT 234 

GCT TTC TGT TCC GCT ATG TAC GTG GGG GAT CTA TGC GGA 273 

TCT GTT TTC CTC GTC TCC CAG CTG TTC ACC TTC TCA CCG 312 

CGC CGG CAT GAG ACA GTA CAG GAC TGC AAT TGC TCC ATC 351 

TAT CCC GGC CAC GTA TCA GGT CAC CGC ATG GCC TGG GAT 390

ATG ATG ATG AAC TGG TCA CCT ACA GCA GCC CTA GTG GTA 429 

TCG CAG TTG CTC CGG ATC CCA CAA GCT GTC GTG GAT ATG 468 

GTG GCG GGG GCC CAC TGG GGA ATC CTG GCG GGC CTT GCC 507 

TAC TAT TCC ATG GTA GGG AAC TGG GCT AAG GTT TTG ATT 546 

GTG ATG CTA CTC TTT GCC GGC GTT GAC GGG 576 



(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: IND8 





(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

TAT GAG GTG CGC AAC GTG TCC GGG GTG TAC CAT GTC ACG 39
AAC GAC TGC TCC AAC TCA AGT ATT GTG TAT GAG GCA GCG 78
GAC ATG ATC ATG CAC ACC CCC GGG TGC GTG CCC TGC GTT 117
CGG GAG GGC AAC TTC TCT AGT TGC TGG GTA GCG CTC ACT 156
CCC ACT CTC GCG GCT AGG AAC GCC AGC GTC CCC ACC ACG 195
ACA ATA CGA CGC CAC GTC GAT TTG CTC GTT GGG GCG GCT 234
GCT TTC TGT TCC GCT ATG TAC GTG GGG GAT CTC TGC GGA 273
TCT GTT TTC CTT GTC TCC CAG CTG TTC ACC TTC TCA CCG 312
CGC CGG CAT GAG ACA GTA CAG GAC TGC AAT TGC TCC ATC 351
TAT CCC GGC CAC GTA TCA GGT CAC CGC ATG GCT TGG GAT 390
ATG ATG ATG AAC TGG TCA CCT ACA GCG GCC CTA GTG GTA 429
TCG CAG TTG CTC CGG ATC CCA CAA GCT GTC GTG GAT ATG 468
GTG GCG GGG GCC CAC TGG GGA ATC CTG GCG GGC CTT GCC 507
TAC TAT TCC ATG GTA GGG AAC TGG GCT AAG GTT TTG ATT 546
GTG ATG CTA CTC TTT GCC GGC GTT GAC GGG 576

(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: P10 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

TAT GAA GTG CGC AAC GTG TCC GGG GTG TAC CAT GTC ACG 39
AAC GAC TGC TCC AAC TCA AGT ATT GTG TAT GAG GCA GCG 78
GAC ATG ATA ATG CAC ACC CCC GGG TGC GTG CCC TGT GTT 117
CGG GAG AAC AAC TCC TCC CGC TGC TGG GTA GCG CTC ACT 156
CCC ACA CTC GCG GCT AGG AAT TCC AGC GTC CCA ACT ACG 195
GCA ATA CGA CGC CAT GTC GAT TTG CTC GTT GGG GCG GCT 234
GCT TTC TGC TCC GCT ATG TAC GTG GGG GAT CTC TGC GGA 273
TCT GTT CTC CTC GTC TCC CAG CTG TTC ACC TTC TCA CCT 312
CGC CGG CAT TGG ACA GTA CAG GAC TGC AAT TGT TCA ATC 351
TAT CCT GGC CAC GTA TCA GGT CAC CGC ATG GCT TGG GAT 390
ATG ATG ATG AAC TGG TCG CCC ACA GCA GCC CTA GTG GTG 429
TCG CAG CTA CTC CGG ATC CCA CAA GCT ATC TTG GAT GTG 468
GTG GCG GGG GCC CAC TGG GGA GTC CTG GCG GGC CTT GCC 507
TAC TAT TCC ATG GTG GGG AAC TGG GCT AAG GTC TTG ATT 546
GTG ATG CTA CTC TTT GCC GGC GTT GAC GGA 576



(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: S9 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

TAT GAA GTG CGC AAC GTA TCC GGG GCG TAC CAT GTC ACG 39 

AAC GAC TGC TCC AAC TCA AGT ATT GTG TAC GAG GCA GCG 78 

GAC GTG ATC ATG CAT ACC CCC GGG TGT GTA CCC TGC GTT 117

CAG GAG GGT AAC TCC TCC CAA TGC TGG GTG GCG CTC ACC 156 

CCC ACG CTC GCG GCC AGG AAC GCT ACC GTC CCC ACC ACG 195 

ACA ATA CGA CGT CAT GTC GAT TTG CTC GTT GGG GCG GCT 234 

GTT TTC TGC TCC GCT ATG TAC GTG GGG GAC CTG TGC GGA 273 

TCT GTT TTC CTC ATC TCC CAG CTG TTC ACC ATC TCG CCC 312 

CGT CGG CAT GAG ACA GTA CAG AAC TGC AAT TGC TCA ATC 351 

TAT CCC GGA CAC GTG ACA GGT CAT CGC ATG GCC TGG GAT 390 

ATG ATG ATG AAC TGG TCG CCT ACA ACA GCC CTA GTG GTA 429

TCG CAG CTA CTC CGG ATC CCA CAA GCT GTC ATG GAT ATG 468 

GTG GCG GGG GCC CAC TGG GGA GTC CTG GCG GGC CTC GCC 507 

TAC TAT TCC ATG GTG GGG AAC TGG GCT AAG GTT TTG ATT 546 

GTG ATG CTA CTT TTT GCT GGT GTT GAC GGG 576 



(2) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: S45 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

TAT GAA GTG CGC AAC GTG TCC GGG GCG TAC CAT GTC ACG 39
AAC GAC TGC TCC AAC TCA AGC ATT GTG TAT GAG GCA GTG 78
GAC GTG ATC CTG CAC ACC CCT GGG TGC GTG CCC TGC GTT 117



CGG GAG AAC AAC TCC [illegible] CGT TGC TGG GTG GCG CTC ACT 156
CCC ACG CTC GCG GCC [illegible] AAC TCC AGC GTC CCC ACT ACG 195
ACA ATA CGA CGT CAC GTC GAT TTG CTC GTT GGG GCG GCT 234
GCT TTC TGC TCC GCT ATG TAC GTG GGG GAT CTC TGC GGA 273
TCT GTT TTC CTT GTT TCC CAG CTG TTC ACC TTC TCG CCT 312
CGT CGG CAT GAG ACA GTA CAG GAC TGC AAC TGT TCA ATC 351
TAT CCC GGC CAC GTA ACA GGT CAC CGC ATG GCT TGG GAT 390
ATG ATG ATG AAC TGG TCG CCT ACA GCA GCC TTA GTG GTA 429
TCG CAG TTA CTC CGG ATC CCA CAA GCT GTC GTG GAC ATG 468
GTG GCG GGG GCC CAC TGG GGA GTC CTG GCG GGC CTT GCC 507
TAC TAT TCC ATG GTG GGG AAC TGG GCT AAG GTT CTG ATT 546
GTG ATG CTA CTC TTT GCC GGC GTT GAC GGG 576



(2) INFORMATION FOR SEQ ID NO: 21: 


(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: SA10 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 

TAT GAA GTG CGC AAC GTG TCC GGG ATG TAC CAT GTC ACG 39 

AAC GAC TGC TCC AAC TCA AGC ATT GTG TAT GAG GCA GCG 78 

GAC ATG ATC ATG CAC ACC CCC GGG TGC GTG CCC TGC GTT 117

CGG GAG AAC AAC TCC TCC CGC TGC TGG GTA GCG CTC ACT 156 

CCC ACG CTC GCG GCC AGG AAC TCC AGC GTC CCC ACT ACG 195 

ACA ATA CGA CGC CAC GTC GAT TTG CTC GTT GGG GCG GCT 234 

GCT TTC TGC TCC GCC ATG TAC GTG GGG GAC CTC TGC GGA 273 

TCT GTT TTC CTT GTC TCC CAG CTG TTC ACC TTC TCG CCT 312 

CGC CGG TAT GAG ACA GTA CAG GAC TGC AAT TGC TCA ATC 351 

TAT CCC GGC CGC GTA ACA GGT CAC CGC ATG GCT TGG GAT 390

ATG ATG ATG AAC TGG TCA CCT ACA ACA GCT CTA GTA GTA 429 

TCG CAG TTA CTC CGG ATC CCA CAA GCT ATC GTG GAC ATG 468 

GTG GCG GGG GCC CAC TGG GGA GTC CTA GCG GGC CTT GCC 507 

TAC TAT TCC ATG GTG GGG AAC TGG GCT AAG GTT TTG ATT 546 

GTT ATG CTA CTC TTT GCC GGC GTT GAC GGG 576 



(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SW2 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 



TAT 


GAA 


GTG 


CGC 


AAC 


GTC 


Ten 






1A1 


CAT 


AAC 


GAC 


TGT 


TCC 


AAC 


TCA 


AGC 


ATT 


GTG 


TAT 


GAG 


GAC 


ATG 


ATC 


ATG 


CAT 


ACC 


CCC 


GGG 


TGC 


GTG 


CCC 


CGG 


GAG 


GCC 


AAC 


TCC 


TCC 


CGC 


TGC 


TGG 


GTA 


GCG 


CCC 


ACG 


CTA 


GCA 


GCC 


AGG 


AAC 


ACC 


AGC 


GTC 


CCC 


ACA 


ATA 


CGA 


CGC 


CAC 


GTC 


GAT 


TTG 


CTC 


GTT 


GGG 


GCT 


TTC 


TGC 


TCC 


GTT 


ATG 


TAC 


GTG 


GGG 


GAT 


CTC 


TCT 


GTT 


TTC 


CTC 


GTC 


TCC 


CAG 


CTG 


TTC 


ACT 


TTT 


CGC 


CGG 


CAC 


GAG 


ACA 


GTA 


CAG 


GAC 


TGC 


AAC 


TGT 


TAT 


CCC 


GGC 


CAC 


GTA 


TCA 


GGT 


CAC 


CGC 


ATG 


GCT 


ATG 


ATG 


ATG 


AAC 


TGG 


TCA 


CCT 


ACA 


GCA 


GCC 


CTG 


TCG 


CAG 


TTA 


CTC 


CGG 


ATC 


CCA 


CAA 


GCT 


GTC 


GTG 


GTA 


GCG


GGG 


GCC 


CAC 


TGG 


GGA 


GTC 


CTG 


GCG 


GGC 


TAC 


TAT 


TCC 


ATG 


GTG 


GGG 


AAC 


TGG 


GCT 


AAG 


GTT 


GTG 


ATG 


CTA 


CTC 




GCT 


GGC 


GTT 


GAC 


GGG 



39 
78 
117 
156 
195 
234 
273 
312 
351 
390 
429 
468 
507 
546 
576 



(2) INFORMATION FOR SEQ ID NO: 23:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: T3 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:

TAC GAA GTG CGC AAC GTG TCC GGG GTG TAC TAT GTC ACG 39
AAC GAC TGT TCC AAC TCA AGC ATT GTG TAT GAG ACA GCG 78
GAC ATG ATC ATG CAC ACC CCT GGG TGC GTG CCC TGC GTT 117
CGG GAG AGC AAT TCC TCC CGC TGC TGG GTA GCG CTT ACT 156
CCC ACG CTC GCG GCC AGG AAC GCC AGC GTC CCC ACT AAG 195
ACA ATA CGA CGT CAC GTC GAC TTG CTC GTT GGG GCG GCT 234
GCT TTC TGT TCC GCT ATG TAC GTG GGG GAT CTC TGC GGA 273
TCT GTT TTC CTC GTC TCC CAG CTG TTC ACT TTC TCG CCT 312
CGC CGG CAT GAG ACA GTA CAG GAC TGC AAC TGC TCA ATC 351
TAT CCC GGC CAC GTA ACA GGT CAC CGT ATG GCT TGG GAT 390
ATG ATG ATG AAC TGG TCG CCC ACA ACG GCA CTA GTG GTG 429
TCG CAG TTG CTC CGG ATC CCA CAA GCT GTC GTG GAC ATG 468
GTG GCG GGG GCC CAC TGG GGA GTC CTG GCG GGC CTT GCC 507
TAC TAT TCC ATG GTG GGG AAC TGG GCT AAG GTT TTG ATT 546
GTG CTG CTA CTC TTT GCC GGC GTT GAT GGG 576



(2) INFORMATION FOR SEQ ID NO: 24: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: T10 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:24: 



TAT GAA GTG CGC AAC GTG TCC GGG ATG TAG CAT GTC ACG 39 

AAC GAC TGC TCC AAC TCA AGC ATT GTG TTT GAG GCA GCG 78

GAC TTG ATC ATG CAC ACC CCC GGG TGC GTG CCC TGC GTT 117

CGG GAG GGC AAC TCC TCC CGC TGC TGG GTA GCG CTC ACT 156

CCC ACG CTC GCG GCC AGG AAC ACC AGC GTC CCC ACT ACG 195 

ACG ATA CGA CGC CAT GTC GAT TTG CTC GTT GGG GCG GCT 234 

GCT TTC TGC TCC GCT ATG TAT GTG GGA GAC CTC TGC GGA 273 

TCT GTT TTC CTC GTC TCT CAG CTG TTC ACC TTC TCG CCT 312 

CGC CGG CAT GAG ACT TTG CAG GAC TGC AAC TGC TCA ATC 351 

TAT CCC GGC CAT CTG TCA GGT CAC CGC ATG GCT TGG GAC 390 

ATG ATC ATG AAC TGG TCG CCT ACA ACA GCT CTA GTG GTG 429

TCG CAG TTA CTC CGG ATC CCA CAA GCT GTC ATG GAC ATG 468 

GTG ACA GGG GCC CAC TGG GGA GTC CTG GCG GGC CTT GCC 507 

TAC TAT TCC ATG GCG GGG AAC TGG GCT AAG GTT TTA ATT 546 

GTG ATG CTA CTC TTT GCC GGC GTT GAT GGG 576 
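
The rows of this listing follow a fixed layout: up to thirteen codons followed by a running nucleotide count that ends at the stated LENGTH of 576. That layout can be checked mechanically. The Python sketch below is an illustration only and is not part of the original disclosure; the function name `check_listing_row` and its return convention are assumptions introduced here. It flags any token that is not a three-letter codon over A, C, G and T (for example a letter O read in place of a nucleotide) and any running count that disagrees with the number of codons seen so far.

```python
VALID_BASES = set("ACGT")


def check_listing_row(row, previous_total=0):
    """Check one listing row such as 'TAT GAA ... ACG 39'.

    Returns (codons, problems): the codon tokens found on the row and a
    list of human-readable problem descriptions (empty if the row is clean).
    """
    tokens = row.split()
    problems = []

    # A trailing all-digit token is the running nucleotide count.
    stated_total = None
    if tokens and tokens[-1].isdigit():
        stated_total = int(tokens.pop())

    # Every remaining token should be exactly three valid bases.
    for token in tokens:
        if len(token) != 3 or not set(token).issubset(VALID_BASES):
            problems.append(f"suspect codon {token!r}")

    # The stated count should equal the previous count plus 3 per codon.
    running_total = previous_total + 3 * len(tokens)
    if stated_total is not None and stated_total != running_total:
        problems.append(
            f"count mismatch: stated {stated_total}, computed {running_total}"
        )
    return tokens, problems
```

Calling the function row by row, and passing each row's stated count as `previous_total` for the next row, would walk a whole 576-nucleotide entry. Note that this check only tests the form of each token; it does not detect a valid codon that has been misread as another valid codon.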



(2) INFORMATION FOR SEQ ID NO: 25: 


(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: US6 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 



TAT GAA GTG CGC AAC GTG TCC GGG ATG TAC CAT GTC ACG 39 

AAC GAC TGC TCC AAC TCA AGC ATT GTG TAT GAG GCA GCG 78 

GAC ATG ATC ATG CAC ACT CCC GGG TGC GTG CCC TGT GTT 117

CGG GAG AAC AAT TCC TCC CGC TGC TGG GTA GCG CTC ACT 156

CCC ACG CTC GCG GCC AGG AAC GCT AGC GTC CCC ACT ACG 195

ACA ATA CGA CGC CAC GTC GAT TTG CTC GTT GGG GCG GCT 234 

ACT TTC TGC TCC GCT ATG TAC GTG GGG GAC CTC TGC GGG 273 

TCC GTT TTC CTC ATC TCC CAG CTG TTC ACC TTC TCG CCT 312 

CGT CAG CAT GAG ACA GTA CAG GAC TGC AAT TGT TCA ATC 351 

TAT CCC GGC CAC GTA TCA GGT CAC CGC ATG GCT TGG GAT 390

ATG ATG ATG AAT TGG TCA CCT ACA GCA GCC CTA GTG GTA 429 



TCG CAG TTA CTC CGG ATC CCA CAA GCT GTC ATG GAC ATG 468

GTG GCG GGG GCC CAC TGG GGA GTC CTG GCG GGC CTT GCC 507 

TAC TAT TCC ATG GTG GGG AAC TGG GCT AAG GTT CTG ATT 546 

GTG TTG CTA CTC TTT GCC GGC GTT GAC GGG 576 

(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: T2 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:26: 



GCC CAA GTG AGG AAC ACC AGC CGC GGT TAC ATG GTG ACT 39
AAC GAC TGT TCC AAT GAG AGC ATC ACC TGG CAG CTC CAA 78
GCC GCG GTT CTC CAC GTC CCC GGG TGT ATC CCG TGT GAG 117
AGG CTG GGA AAT ACA TCC CGA TGC TGG ATA CCG GTC ACA 156
CCA AAC GTG GCC GTG CGG CAG CCC GGC GCT CTT ACG CAG 195
GGC TTG CGG ACG CAC ATC GAC ATG GTT GTG ATG TCC GCC 234
ACG CTC TGC TCT GCC CTC TAC GTG GGG GAC CTC TGC GGC 273
GGG GTG ATG CTC GCA GCC CAG ATG TTC ATT GTC TCG CCG 312
GGA CGC CAC TGG TTT GTG CAA GAA TGC AAT TGC TCC ATC 351
TAC CCC GGT ACC ATC ACT GGA CAC GGT ATG GCA TGG GAC 390
ATG ATG ATG AAC TGG TCG CCC ACA GCC ACC ATG ATC CTG 429
GCG TAC GCG ATG CGC GTT CCC GAG GTC ATC ATA GAC ATC 468
ATC GGC GGG GCT CAC TGG GGC GTC ATG TTT GGC TTG GCC 507
TAC TTC TCT ATG CAG GGA GCG TGG GCG AAG GTC ATT GTC 546
ATC CTC TTG CTG GCT GCT GGG GTG GAC GCG 576



(2) INFORMATION FOR SEQ ID NO: 27:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: T4 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:27:

GCA CAA GTG AAG AAC ACC ACT AAC AGC TAC ATG GTG ACC 39
AAC GAC TGT TCT AAT GAC AGC ATC ACT TGG CAG CTC CAG 78
GCC GCG GTC CTC CAC GTC CCC GGG TGT GTC CCG TGC GAG 117
AAA ACG GGA ??? ??? ??? ??? ??? TGG ATA CCG GTT TCA 156
CCA AAC ??? ??? ??? ??? ??? CCC GGC GCC CTC ACG CAG 195
??? ??? ??? ??? ??? ??? ??? ATG GTT GTG ATG TCC GCC 234
??? ??? ??? ??? ??? ??? TAC GTG GGG GAC CTC TGC GGC 273
GGG GTG ATG CTC ??? ??? ??? ??? ??? ATC GTC TCG CCG 312
CAA CAT CAC TGG ??? GTG CAA GAC TGC AAT TGC TCT ATC 351
TAC CCT GGC ACC ATC ACT GGA CAC CGT ATG GCA TGG GAT 390
ATG ATG ATG AAC TGG TCG CCC ACG GCC ACC ATG ATC CTG 429
GCG TAC GCG ATG CGC GTT CCC GAG GTC ATC TTA GAC ATC 468
GTT AGC GGG GCA CAC TGG GGC GTC ATG TTC GGC TTG GCC 507
TAC TTC TCT ATG CAG GGA GCG TGG GCG AAA GTC GTT GTC 546
ATC CTT CTG CTG GCC GCT GGG GTG GAC GCG 576






(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: T9 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 



GCC GAA GTG AAG AAC ACC AGT ACC AGC TAC ATG GTG ACA 39
AAT GAC TGT TCC AAC GAC AGC ATC ACC TGG CAA CTC CAG 78
GCC GCG GTC CTC CAC GTC CCC GGG TGC GTC CCG TGC GAG 117
AGA GTT GGA AAC GCG TCG CGG TGC TGG ATA CCG GTC TCG 156
CCA AAC GTA GCT GTG CAG CGG CCT GGC GCC CTC ACG CAG 195
GGC TTG CGG ACG CAC ATC GAC ATG GTT GTG ATG TCC GCC 234
ACG CTC TGC TCC GCT CTC TAC GTG GGG GAT CTC TGC GGC 273
GGG GTA ATG CTC GCC GCT CAG ATG TTC ATT ATC TCG CCG 312
CAG CAC CAC TGG TTT GTG CAG GAA TGC AAC TGC TCC ATT 351
TAC CCT GGT ACC ATC ACT GGA CAC CGT ATG GCA TGG GAC 390
ATG ATG ATG AAC TGG TCG CCC ACA ACC ACC ATG ATC TTG 429
GCG TAC GCG ATG CGC GTT CCC GAG GTC ATC ATA GAC ATC 468
ATC AGC GGA GCT CAC TGG GGC GTC ATG TTC GGC CTA GCC 507
TAC TTC TCT ATG CAG GGA GCG TGG GCG AAG GTC GTT GTC 546
ATC CTG TTG CTC ACC GCT GGC GTG GAC GCG 576



(2) INFORMATION FOR SEQ ID NO: 29:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: 10



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 











??? ??? ??? ??? ??? ??? ??? ACC ??? TAT ATG GTG ACC 39
AAT GAC TGC TCC AAC GAC AGC ATC ACT TGG CAA CTT GAG 78
GCT GCG GTC CTC CAC GTT CCC GGG TGT GTC CCG TGC GAG 117
AAA GTG GGA AAT ACA TCT CGG TGC TGG ATA CCG GTC TCA 156
CCA AAT GTG GCC GTG CAG CGG CCT GGC GCC CTC ACG CAG 195
GGC TTG CGG ACT CAC ATC GAC ATG GTC GTG ATG TCC GCC 234
ACG CTC TGC TCC GCT CTT TAC GTG GGG GAC TTC TGC GGT 273
GGG ATG ATG CTC GCA GCC CAA ATG TTC ATT GTC TCG CCG 312
CGC CAC CAC TCG TTT GTG CAG GAA TGC AAC TGC TCC ATC 351
TAC CCC GGT ACC ATC ACC GGG CAC CGT ATG GCA TGG GAC 390
ATG ATG ATG AAC TGG TCG CCC ACG GCC ACT TTG ATC CTG 429
GCG TAC GTG ATG CGC GTT CCC GAG GTC ATC ATA GAC ATC 468
ATT AGC GGG GCG CAT TGG GGC GTC TTG TTC GGC TTA GCC 507
TAC TTC TCT ATG CAG GGA GCG TGG GCG AAA GTC GTT GTC 546
ATC CTT CTG CTA GCC GCT GGG GTG GAC GCG 576

(2) INFORMATION FOR SEQ ID NO: 30:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens 

(C) INDIVIDUAL ISOLATE: DK8 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 



GTG GAA GTC AGG AAC ATC AGT TCC AGC TAC TAC GCC ACC 39 

AAT GAT TGC TCA AAC AAC AGC ATC ACC TGG CAA CTC ACC 78

GAC GCA GTT CTC CAC CTT CCC GGA TGC GTC CCA TGT GAG 117 

AAT GAC AAT GGC ACC CTG CGC TGC TGG ATA CAA GTG ACA 156 

CCT AAT GTG GCT GTG AAA CAC CGC GGC GCA CTT ACT CAT 195 

AAC CTG CGA ACA CAC GTC GAC GTG ATC GTA ATG GCA GCT 234 

ACG GTC TGC TCG GCC TTG TAT GTG GGA GAC GTA TGC GGG 273 

GCC GTG ATG ATC GTG TCG CAG GCT CTC ATA ATA TCG CCT 312 

GAA CGC CAC AAC TTT ACC CAG GAG TGC AAC TGT TCC ATC 351 

TAC CAA GGT CAT ATC ACC GGC CAC CGC ATG GCA TGG GAC 390

ATG ATG CTA AAC TGG TCA CCA ACT CTT ACC ATG ATC CTC 429 

GCC TAT GCC GCT CGT GTT CCT GAG CTA GCC CTC CAG GTT 468 

GTC TTC GGC GGC CAT TGG GGC GTG GTG TTT GGC TTG GCC 507 

TAT TTC TCC ATG CAG GGA GCG TGG GCC AAA GTC ATT GCC 546 

ATC CTC CTT CTT GTC GCA GGA GTG GAT GCA 576 



(2) INFORMATION FOR SEQ ID NO: 31:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: DK11 





(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:

GTG GAA GTC AGG AAC ACC AGT TCT AGT TAC TAC GCC ACC 39
AAT GAT TGC TCA AAC AAC AGC ATC ACC TGG CAA CTC ACC 78
AAC GCA GTT CTC CAC CTT CCC GGA TGC GTC CCA TGT GAG 117
AAT GAC AAT GGC ACC CTG CAC TGC TGG ATA CAA GTG ACA 156
CCT AAT GTG GCT GTG AAA CAC CGC GGC GCA CTC ACT CAC 195
AAC CTG CGA GCA CAT ATA GAT ATG ATT GTA ATG GCA GCT 234
ACG GTC TGC TCG GCC TTG TAT GTG GGA GAC GTG TGC GGG 273
GCC GTG ATG ATC GTG TCG CAG GCT TTC ATA GTA TCG CCA 312
GAA CAC CAC CAC TTT ACC CAA GAG TGC AAC TGT TCC ATC 351
TAC CAA GGT CAC ATC ACC GGC CAC CGC ATG GCA TGG GAC 390
ATG ATG CTT AAC TGG TCA CCA ACT CTC ACC ATG ATC CTC 429
GCC TAT GCC GCC CGT GTT CCT GAG CTA GTC CTT GAA GTC 468
GTC TTC GGT GGT CAT TGG GGT GTG GTG TTT GGC TTG GCC 507
TAT TTC TCC ATG CAG GGA GCG TGG GCC AAG GTC ATT GCC 546
ATC CTC CTT CTT GTA GCA GGA GTG GAT GCA 576

(2) INFORMATION FOR SEQ ID NO: 32:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: SW3 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 

GTG GAA GTC AGG AAC ATC AGT TCT AGC TAC TAT GCC ACC 39 

AAT GAT TGC TCA AAC AGC AGC ATC ACC TGG CAA CTC ACC 78 

AAC GCA GTC CTC CAC CTT CCC GGA TGC GTC CCG TGT GAG 117

AAT GAT AAT GGC ACC CTG CAC TGC TGG ATA CAA GTG ACA 156 

CCT AAT GTG GCT GTG AAA CAC CGC GGC GCG CTC ACT CAC 195 

AAC CTG CGA GCA CAC GTC GAT ATG ATC GTA ATG GCA GCT 234 

ACG GTC TGC TCG GCC TTG TAT GTG GGA GAC ATG TGC GGG 273 

GCC GTG ATG ATC GTG TCG CAG GCT TTC ATA ATA TCG CCA 312 

GAA CGC CAC AAC TTT ACC CAA GAG TGC AAC TGT TCC ATC 351 

TAC CAA GGT CGT ATC ACC GGC CAC CGC ATG GCG TGG GAC 390

ATG ATG CTA AAC TGG TCA CCA ACT CTT ACC ATG ATC CTT 429

GCC TAT GCC GCT CGT GTT CCT GAG CTA GTC CTT GAA GTT 468

GTC TTC GGC GGC CAT TGG GGC GTG GTG TTT GGC TTG GCC 507 

TAT TTC TCC ATG CAA GGA GCG TGG GCC AAG GTC ATT GCC 546 

ATC CTC CTG CTT GTC GCA GGA GTG GAT GCA 576 
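
Because each row is written codon by codon, a deduced amino acid sequence can be read off a row directly. The following Python sketch is an illustration only and is not part of the original disclosure. It assumes the standard genetic code and the DNA alphabet used in this listing; the names `CODON_TABLE` and `translate_row` are introduced here for the example. Tokens that are not valid codons are reported as `X` rather than guessed.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
# (TTT, TTC, TTA, TTG, TCT, ...). '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_row(row):
    """Translate one listing row into one-letter amino acid codes.

    Trailing running counts (all-digit tokens) are skipped. Any token that
    is not a valid codon is rendered as 'X'.
    """
    residues = []
    for token in row.split():
        if token.isdigit():
            continue  # running nucleotide count, not sequence
        residues.append(CODON_TABLE.get(token.upper(), "X"))
    return "".join(residues)
```

Applied to the first row of SEQ ID NO: 32, `translate_row` yields the thirteen residues VEVRNISSSYYAT. A `*` in the middle of a translated row would indicate an in-frame stop codon, which in a coding sequence of this kind is worth checking against the source document.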

(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE : 

(A) ORGANISM: homosapiens 

(C) INDIVIDUAL ISOLATE: T8 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:

GTG GAA GTT AGA AAC ACC ??? ??? AGC TAC TAC GCC ACC 39
AAT GAT TGC TCG AAC AAC ??? ??? ACC TGG CAG CTC ACC 78
AAC GCA GTT CTC CAC CTT ??? ??? TGC GTC CCA TGT GAG 117
AAT GAC AAT GGC ACC TTG ??? ??? TGG ATA CAA GTA ACA 156
CCT AAT GTG GCT GTG AAA ??? ??? GGC GCA CTC ACT CAC 195
AAC CTG CGA ACG CAT GTC ??? ??? ATC GTA ATG GCA GCT 234
ACG GTC TGC TCG GCC TTG ??? ??? GGG GAC GTG TGC GGG 273
GCC GTG ATG ATA GCG TCG ??? ??? TTC ATA ATA TCG CCA 312
GAA CGC CAC AAC TTC ACC ??? ??? TGC AAC TGT TCC ATC 351
TAC CAA GGT CAT ATC ACC ??? ??? CGC ATG GCA TGG GAC 390
ATG ATG CTG AAC TGG TCA ??? ??? CTC ACC ATG ATC CTC 429
GCC TAC GCT GCT CGT GTG ??? ??? CTA GTC CTT GAA GTT 468
GTC TTC GGC GGC CAT TGG ??? ??? GTG TTT GGC TTG GCC 507
TAT TTC TCC ATG CAA GGA ??? ??? GCC AAA GTC ATC GCC 546
ATC CTC CTC CTT GTC GCA ??? ??? GAC GCA 576



(2) INFORMATION FOR SEQ ID NO: 34:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: S83 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 

GTG GAG GTC AAG GAC ACC GGC GAC TCC TAC ATG CCG ACC 39 
AAC GAT TGC TCC AAC TCT AGT ATC GTT TGG CAG CTT GAA 78
GGA GCA GTG CTT CAT ACT CCT GGA TGC GTC CCT TGT GAG 117 






CGT 


ACC 


GCC 

VJ Vm» 


AAC 


GTC 


X ^ X 






rpriri 


flTC 




uli 


nee 


loo 


ccc 


AAT 

nn x 


PTP 


GCC 


ATA 


AHT 


pa a 




wt 


urV~x 


Lit 






X95 


GGC 


CTG 
^ x\j 


CGA 


HPA 


CAC 


ATP 

AX C 


Vsnl 


Alt 


ATP 




Tiny** 




PPT 




ACG GTC TGT TCT GCC CTT TAT GTG GGG GAC GTG TGT GGC 273
GCG CTG ATG CTG GCC GCT CAG GTC GTC GTC GTG TCG CCA 312
CAA CAC CAT ACG TTT GTC CAG GAA TGC AAC TGT TCC ATA 351
TAC CCG GGC CGC ATT ACG GGA CAC CGC ATG GCT TGG GAT 390
ATG ATG ATG AAC TGG TCG CCC ACT ACC ACC ATG CTC CTG 429
GCG TAC TTG GTG CGC ATC CCG GAA GTC ATC TTG GAT ATT 468
GTT ACA GGA GGT CAT TGG GGT GTA ATG TTT GGC CTC GCT 507
TAC TTC TCC ATG CAG GGA TCG TGG GCG AAG GTC ATC GTT 546
ATC CTC CTG CTG ACT GCT GGG GTG GAG GCG 576

(2) INFORMATION FOR SEQ ID NO: 35:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: DK12 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 

TTA GAG TGG CGG AAT GTG TCC GGC CTC TAC GTC CTT ACC 39 

AAC GAC TGT TCC AAT AGC AGT ATC GTG TAT GAG GCC GAT 78 

GAC GTC ATT CTG CAC ACA CCT GGC TGT GTA CCT TGT GTT 117

CAG GAC GGC AAT ACA TCT ACG TGC TGG ACC TCA GTG ACG 156 

CCT ACA GTG GCA GTC AGG TAC GTC GGA GCA ACC ACC GCT 195 

TCG ATA CGC AGT CAT GTG GAC CTG CTA GTG GGC GCG GCC 234 

ACG ATG TGC TCT GCG CTC TAC GTG GGT GAT GTG TGT GGG 273 

GCC GTC TTC CTT GTG GGA CAA GCC TTC ACG TTC AGA CCT 312 

CGT CGC CAT CAA ACA GTC CAG ACC TGT AAC TGC TCG CTG 351 

TAC CCA GGC CAT CTT TCA GGA CAT CGA ATG GCT TGG GAT 390

ATG ATG ATG AAT TGG TCC CCC GCT GTG GGT ATG GTG GTA 429 

GCG CAC GTC CTG CGT CTG CCC CAG ACC TTG TTC GAC ATA 468 

ATA GCT GGG GCC CAT TGG GGC ATC ATG GCG GGC CTA GCC 507 

TAT TAC TCC ATG CAG GGC AAC TGG GCC AAG GTC GCT ATC 546 

ATC ATG GTT ATG TTT TCA GGA GTC GAT GCC 576 



(2) INFORMATION FOR SEQ ID NO: 36:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE:
(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: HK10

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 



CTA GAG TGG CGG AAT ??? ??? ??? ??? TAT GTC CTT ACC 39
AAC GAC TGT CCC AAT AGC AGT ATT GTG TAT GAG GCC GAT 78
GAC GTC ATT CTG CAC ACA CCT GGC TGT GTA CCT TGT GTT 117
CAG GAC GGC AAT ACA TCC ACG TGC TGG ACC TCG GTG ACA 156
CCT ACA GTG GCA GTC AGG TAC GTC GGA GCA ACC ACC GCC 195
TCG ATA CGC AGT CAT GTG GAC CTG TTA GTG GGC GCG GCC 234
ACG ATG TGC TCT GCG CTC TAC GTG GGC GAT ATG TGT GGG 273
GCC GTC TTC CTC GTG GGA CAA GCC TTC ACG TTC AGA CCG 312
CGT CGC CAT CAA ACG GTC CAG ACC TGT AAC TGC TCG CTG 351
TAC CCA GGC CAC CTT TCA GGA CAT CGA ATG GCT TGG GAT 390
ATG ATG ATG AAT TGG TCC CCC GCC GTG GGT ATG GTG GTG 429
GCG CAC GTC CTG CGG TTG CCC CAG ACC TTG TTC GAC ATA 468
ATA GCC GGG GCC CAT TGG GGC ATC TTG GCA GGC CTA GCC 507
TAT TAC TCC ATG CAG GGC AAC TGG GCC AAG GTC GCT ATC 546
ATC ATG GTT ATG TTT TCA GGG GTC GAT GCC 576



(2) INFORMATION FOR SEQ ID NO: 37:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: S2 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37: 



CTA GAG TGG CGG AAT ACG TCT GGC CTC TAT GTC CTC ACC 39
AAC GAC TGT TCC AAT AGC AGT ATT GTG TAT GAG GCC GAT 78
GAC GTT ATT CTG CAC ACA CCT GGC TGT GTA CCT TGT GTT 117
CAG GAC GGT AAT ACA TCC ACG TGC TGG ACC CCA GTG ACA 156
CCT ACA GTG GCA GTC AGG TAT GTC GGA GCA ACC ACC GCT 195
TCG ATA CGC AGT CAT GTG GAC CTA TTG GTG GGC GCG GCC 234
ACT ATG TGC TCT GCG CTC TAC GTG GGT GAT ATG TGT GGG 273
GCC GTC TTT CTC GTG GGA CAA GCC TTC ACG TTC AGA CCT 312
CGT CGC CAT CAA ACG GTC CAG ACC TGT AAC TGC TCG CTG 351
TAC CCA GGC CAT CTT TCA GGA CAT CGC ATG GCT TGG GAT 390
ATG ATG ATG AAT TGG TCC CCC GCT GTG GGT ATG GTG GTG 429
GCG CAC GTT CTG CGT TTG CCC CAG ACC GTG TTC GAC ATA 468
ATA GCC GGG GCC CAT TGG GGC ATC TTG GCG GGC CTA GCC 507
TAT TAC TCC ATG CAA GGC AAC TGG GCC AAG GTC GCT ATC 546
ATC ATG GTT ATG TTT TCA GGG GTC GAC GCC 576



(2) INFORMATION FOR SEQ ID NO: 38:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: S52

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:



CTA GAG TGG CGG AAT ACG TCT GGC CTC TAT GTC CTT ACC 39 

AAC GAC TGT TCC AAT AGC AGT ATT GTG TAT GAG GCC GAT 78 

GAC GTC ATT CTG CAC ACA CCC GGC TGT GTA CCT TGT GTT 117 

CAG GAC GGC AAT ACA TCC ATG TGC TGG ACC CCA GTG ACA 156

CCT ACG GTG GCA GTC AGG TAC GTC GGA GCA ACC ACC GCT 195 

TCG ATA CGC AGT CAT GTG GAC CTA TTA GTG GGC GCG GCC 234 

ACG CTG TGC TCT GCG CTC TAT GTG GGT GAT ATG TGT GGG 273 

GCC GTC TTT CTC GTG GGA CAA GCC TTC ACG TTC AGA CCT 312 

CGT CGC CAT CAA ACG GTC CAG ACC TGT AAC TGC TCG CTG 351 

TAC CCA GGC CAT GTT TCA GGA CAT CGA ATG GCT TGG GAT 390 

ATG ATG ATG AAT TGG TCC CCC GCT GTG GGT ATG GTG GTG 429

GCG CAC ATC CTG CGA TTG CCC CAG ACC TTG TTT GAC ATA 468 

CTG GCC GGG GCC CAT TGG GGC ATC TTG GCG GGC CTA GCC 507 

TAT TAT TCT ATC CAG GGC AAC TGG GCC AAG GTC GCT ATT 546 

GTC ATG ATT ATG TTT TCA GGG GTC GAT GCC 576 



(2) INFORMATION FOR SEQ ID NO: 39: 


(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: S54 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 



CTA GAG TGG CGG AAT ACG TCT GGC CTC TAT ATC CTT ACC 39 

AAC GAC TGT TCC AAT AGC AGT ATT GTG TAT GAG GCC GAT 78

GAC GTC ATT CTG CAC ACA CCC GGC TGT GTA CCT TGT GTT 117

CAG GAC GGC AAT ACA TCC ACG TGC TGG ACC CCA GTG ACA 156

CCT ACG GTG GCA GTC AGG TAC GTC GGA GCA ACC ACC GCT 195

TCG ATA CGC AGT CAT GTG GAC CTA TTA GTG GGC GCG GCC 234 

ACG CTG TGC TCT GCG CTC TAT GTG GGT GAT ATG TGT GGG 273 

GCC GTC TTT CTC GTG GGA CAA GCC TTC ACG TTC AGA CCT 312 

CGT CGC CAT CAA ACG GTC CAG ACC TGT AAC TGC TCG CTG 351 

TAC CCA GGC CAT CTT TCA GGA CAT CGA ATG GCT TGG GAT 390 

ATG ATG ATG AAT TGG TCC CCC GCT GTG GGT ATG GTG GTG 429

GCG CAC ATC CTG CGA TTG CCC CAG ACC TTG TTT GAC ATA 468

CTG GCC GGG GCC CAT TGG GGC ATC TTG GCG GGC CTA GCC 507 

TAT TAT TCT ATG CAG GGC AAC TGG GCC AAG GTC GCT ATC 546 

ATC ATG ATT ATG TTT TCA GGG GTC GAT GCC 576 
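
Entries such as SEQ ID NO: 38 and SEQ ID NO: 39 differ at only a few positions, and the grouping of isolates into genotypes rests on comparisons of this kind. As an illustration only, and not as the method used by the inventors, the Python sketch below computes a simple ungapped percent identity between two equal-length entries written in the row format of this listing. The function name `percent_identity` is an assumption introduced here; it performs no alignment and simply compares position by position.

```python
def percent_identity(listing_a, listing_b):
    """Ungapped percent nucleotide identity between two listing excerpts.

    Both arguments are text in the listing's row format; running counts
    (all-digit tokens) and whitespace are ignored. The two sequences must
    have the same non-zero length, since no alignment is attempted.
    """

    def nucleotides(text):
        # Drop running counts, join the codons, and normalize case.
        return "".join(
            token for token in text.split() if not token.isdigit()
        ).upper()

    seq_a = nucleotides(listing_a)
    seq_b = nucleotides(listing_b)
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal length")
    if not seq_a:
        raise ValueError("sequences must not be empty")

    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)
```

For the first rows of SEQ ID NO: 38 and SEQ ID NO: 39, which differ at a single nucleotide (GTC versus ATC in the eleventh codon), the function returns 38 matches out of 39 positions, about 97.4 percent. Passing the full text of two entries would give the identity over all 576 positions.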

(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 

(C) INDIVIDUAL ISOLATE: Z4 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40: 



GAG CAC TAC CGG AAT GCT TCG GGC ATC TAT CAC ATC ACC 39
AAT GAT TGT CCG AAT TCC AGT ATA GTC TAT GAA GCT GAC 78
CAT CAC ATC CTA CAC TTG CCG GGG TGC GTA CCC TGT GTG 117
ATG ACT GGG AAC ACA TCG CGT TGC TGG ACG CCG GTG ACG 156
CCT ACA GTG GCT GTC GCA CAC CCG GGC GCT CCG CTT GAG 195
TCG TTC CGG CGA CAT GTG GAC TTA ATG GTA GGC GCG GCC 234
ACT TTG TGT TCT GCC CTC TAT GTT GGG GAC CTC TGC GGA 273
GGT GCC TTC CTG ATG GGG CAG ATG ATC ACT TTT CGG CCG 312
CGT CGC CAC TGG ACC ACG CAG GAG TGC AAT TGT TCC ATC 351
TAC ACT GGC CAT ATC ACC GGC CAC AGG ATG GCG TGG GAC 390
ATG ATG ATG AAC TGG AGC CCT ACC ACC ACT CTG CTC CTC 429
GCC CAG ATC ATG AGG GTC CCC ACA GCC TTT CTC GAC ATG 468
GTT GCC GGA GGC CAC TGG GGC GTC CTC GCG GGC TTG GCG 507
TAC TTC AGC ATG CAA GGC AAT TGG GCC AAG GTA GTC CTG 546
GTC CTT TTC CTC TTT GCT GGG GTA GAC GCC 576



(2) INFORMATION FOR SEQ ID NO: 41:

(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: Z1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41: 

GTG CAC TAC CGG AAT GCT TCG GGC GTC TAT CAT GTC ACC 39
AAT GAT TGC CCT AAC ACC AGC ATA GTG TAC GAG ACG GAG 78
CAC CAC ATC ATG CAC TTG CCA GGG TGT GTC CCC TGT GTG 117
CGG ACG GAG ART ACT TCT CGC ??? ??? ??? ??? ??? ACC 156
CCC ACT GTG GCC GCG CCC TAT ??? ??? GCA CCG TTA GAG 195
TCC ATG CGC AGG CAT GTA GAC CTG ATG GTG GGT GCG GCT 234
ACT ATG TGT TCC GCC TTC TAC ATT GGA GAT CTG TGT GGA 273
GGC GTC TTC CTA GTG GGC CAG CTG TTC GAC TTC CGA CCG 312
CGC CGG CAC TGG ACC ACC CAG GAT TGC AAC TGC TCC ATC 351
TAT CCT GGT CAC GTC TCG GGC CAC AGG ATG GCC TGG GAC 390
ATG ATG ATG AAC TGG AGC CCT ACC AGC GCG CTG ATT ATG 429
GCT CAG ATC TTA CGG ATC CCC TCT ATC CTA GGT GAC TTG 468
CTC ACC GGG GGT CAC TGG GGA GTT CTT GCT GGT CTA GCT 507
TTC TTC AGC ATG CAG AGT AAC TGG GCG AAG GTC ATC CTG 546
GTC CTA TTC CTC TTT GCC GGG GTC GAG GGA 576

(2) INFORMATION FOR SEQ ID NO: 42:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens 

(C) INDIVIDUAL ISOLATE: Z6 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42: 



GTT AAC TAT CGC AAT GCC TCG GGC GTC TAT CAC GTC ACC 39 

AAC GAC TGC CCG AAC TCG AGC ATA GTG TAT GAG GCC GAA 78 

CAC CAG ATC TTA CAC CTC CCA GGG TGC TTG CCC TGT GTG 117

AGG GTT GGG AAT CAG TCA CGC TGC TGG GTG GCC CTT ACT 156 

CCC ACC GTG GCG GTG TCT TAT ATC GGT GCT CCG CTT GAC 195 

TCC CTC CGG AGA CAT GTG GAC CTG ATG GTG GGC GCC GCT 234 

ACT GTA TGC TCT GCC CTC TAC GTT GGA GAT CTG TGC GGT 273 

GGT GCA TTC TTG GTT GGC CAG ATG TTC TCC TTC CAG CCG 312 

CGA CGC CAC TGG ACT ACG CAG GAC TGC AAT TGT TCT ATC 351 

TAC GCA GGG CAT ATC ACG GGC CAC AGG ATG GCA TGG GAC 390

ATG ATG ATG AAC TGG AGT CCC ACA ACC ACC CTG CTT CTC 429 

GCC CAG GTC ATG AGG ATC CCT AGC ACT CTG GTA GAT CTA 468 

CTC GCT GGA GGG CAC TGG GGC GTC CTT GTT GGG TTG GCG 507 

TAC TTC AGT ATG CAA GCT AAT TGG GCC AAA GTC ATC CTG 546 

GTC CTT TTC CTC TTC GCT GGA GTT GAT GCC 576 



(2) INFORMATION FOR SEQ ID NO: 43:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: Z7





(xi) SEQUENCE DESCRIPTION: SEQ ID NO:43:

GTC AAC TAT CAC AAT GCC TCG GGC GTC TAT CAC ATC ACC 39
AAC GAC TGC CCG AAC TCG AGC ATA ATG TAT GAG GCC GAA 78
CAC CAC ATC CTA CAC CTC CCA GGG TGC GTA CCC TGT GTG 117
AGG GAG GGG AAC CAG TCA CGC TGC TGG GTG GCC CTT ACT 156
CCC ACC GTG GCG GCG CCT TAT ATC GGT GCA CCG CTT GAA 195
TCC ATC CGG AGA CAT GTG GAC CTG ATG GTA GGC GCT GCT 234
ACA GTG TGC TCC GCT CTC TAC ATT GGG GAC CTG TGC GGT 273
GGC GTA TTT TTG GTT GGT CAG ATG TTT TCT TTC CAG CCG 312
CGA CGC CAC TGG ACT ACG CAG GAC TGC AAT TGT TCC ATC 351
TAT GCG GGG CAC GTT ACA GGC CAC AGA ATG GCA TGG GAC 390
ATG ATG ATG AAC TGG AGT CCC ACA ACC ACC TTG GTC CTC 429
GCC CAG GTT ATG AGG ATC CCT AGC ACT CTG GTG GAC CTA 468
CTC ACT GGA GGG CAC TGG GGT ATC CTT ATC GGG GTG GCA 507
TAC TTC TGC ATG CAA GCT AAT TGG GCC AAG GTC ATT CTG 546
GTC CTT TTC CTC TAC GCT GGA GTT GAT GCC 576



(2) INFORMATION FOR SEQ ID NO: 44:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: DK13 



25 



30 



35 





(xi) 


SEQUENCE 


DESCRIPTION 


: SEQ ID 


NO: 44: 




TAC 


AAC 


TAT 


CGC 


AAC 


AGC 


TCG 


GGT 


GTC 


TAC 


CAT 


GTC 


ACC 


39 


AAC 


GAT 


TGC 


CCG 


AAC 


TCG 


AGC 


ATA 


GTC 


TAT 


GAA 


ACC 


GAT 


78 


TAC 


CAC 


ATC 


TEA 


CAC 


CTC 


CCG 


GGA 


TGC 


GTT 


CCT 


TGC 


GTG 


117 


AGG 


GAA 


GGG 


AAC 


AAG 


TCT 


ACA 


TGC 


TGG 


GTG 


TCT 


CTC 


ACC 


156 


CCC 


ACC 


GTG 


GCT 


GCG 


CAA 


CAT 


CTG 


AAT 


GCT 


CCG 


CTT 


GAG 


195 


TCT 


TTG 


AGA 


CGT 


CAC 


GTG 


GAT 


CTG 


ATG 


GTG 


GGC 


GGC 


GCC 


234 


ACT 


CTC 


TGC 


TCC 


GCC 


CTC 


TAC 


ATC 


GGA 


GAC 


GTG 


TGT 


GGG 


273 


GGT 


GTG 


TTC 


TTG 


GTC 


GGT 


CAA 


CTG 


TTC 


ACC 


TTC 


CAA 


CCT 


312 


CGC 


CGC 


CAC 


TGG 


ACC 


ACC 


CAA 


GAC 


TGC 


AAT 


TGT 


TCC 


ATC 


351 


TAC 


ACA 


GGA 


CAT 


ATC 


ACA 


GGA 


CAC 


AGA 


ATG 


GCT 


TGG 


GAC 


390 


ATG 


ATG 


ATG 


AAT 


TGG 


AGC 


CCC 


ACT 


GCG 


ACG 


CTG 


GTC 


CTC 


429 


GCC 


CAA 


CTT 


ATG 


AGG 


ATC 


CCA 


GGC 


GCC 


ATG 


GTC 


GAC 


CTG 


468 


CTT 


GCA 


GGC 


GGC 


CAC 


TGG 


GGC 


ATT 


CTG 


GTT 


GGC 


ATA 


GCG 


507 


TAC 


TTC 


AGC 


ATG 


CAA 


GCT 


AAT 


TGG 


GCC 


AAG 


GTT 


ATC 


CTG 


546 


GTC 


CTG 


TTT 


CTC 


TTT 


GCT 


GGA 


GTC 


GAC 


GCT 
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(2) INFORMATION FOR SEQ ID NO: 45: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SA1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45:

GTT CCC TAC CGG AAT GCC TCT GGG GTT TAC CAT GTC ACC 39

AAT GAC TGC CCA AAC TCC TCC ATA GTC TAC GAG GCT GAT 78

AGC CTG ATC TTG CAC GCA CCT GGC TGC GTG CCC TGT GTC 117

AGG CAA GAT AAT GTC AGT AGG TGC TGG GTC CAA ATC ACC 156

CCC ACA CTG TCA GCC CCG ACC TTC GGA GCG GTC ACG GCT 195

CCT CTT CGG AGG GCC GTT GAC TAC TTA GCG GGA GGA GCT 234

GCT CTC TGC TCC GCA CTA TAC GTC GGC GAC GCG TGC GGG 273

GCA GTG TTT CTG GTA GGC CAA ATG TTC ACC TAT AGG CCT 312

CGC CAG CAT ACC ACA GTG CAG GAC TGC AAC TGT TCC ATT 351

TAC AGT GGC CAT ATC ACC GGC CAC CGG ATG GCT TGG GAC 390

ATG ATG ATG AAT TGG TCA CCT ACG ACA GCC TTG CTG ATG 429

GCC CAG ATG CTA CGG ATC CCC CAG GTG GTC ATA GAC ATC 468

ATA GCC GGG GGC CAC TGG GGG GTC TTG TTT GCC GCC GCA 507

TAC TTT GCG TCG GCC GCC AAC TGG GCT AAG GTA GTG CTG 546

GTT CTG TTC CTG TTT GCG GGG GTC GAT GGC 576

(2) INFORMATION FOR SEQ ID NO: 46:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SA4

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46:

GTT CCC TAC CGA AAC GCC TCT GGG GTT TAT CAT GTC ACC 39

AAT GAT TGC CCA AAC TCT TCC ATA GTT TAC GAG GCT GAT 78

AAC CTG ATC TTG CAT GCA CCT GGT TGC GTG CCT TGT GTC 117

AGG CAA GAT AAT GTC AGT AAG TGC TGG GTC CAA ATC ACC 156

CCC ACG TTG TCA GCC CCG AAT CTC GGA GCG GTC ACG GCT 195

CCT CTT CGG AGG GCC GTT GAC TAC TTA GCG GGA GGG GCT 234

GCC CTC TGC TCC GCA CTA TAC GTC GGG GAC GCG TGC GGG 273

GCA GTG TTT TTG GTA GGC CAA ATG TTC ACC TAT AGG CCT 312

CGC CAG CAC ACT ACG GTG CAA GAC TGC AAT TGC TCT ATT 351

TAC AGT GGC CAT ATC ACC GGC CAC CGG ATG GCA TGG GAC 390

ATG ATG ATG AAT TGG TCA CCT ACG ACG GCC TTG CTG ATG 429

S JTG CGG ATT CCC CAG GTG GTC ATC GAC ATC 468

ATT GCC GGG GGC CAC TGG GGG GTC TTG TTT GCC GCC GCA 507

TAT TTC GCG TCA GCG GCT AAC TGG GCT AAG GTT ATA CTG 546

GTC TTG TTT CTG TTT GCG GGG GTC GAT GCC 576

(2) INFORMATION FOR SEQ ID NO: 47: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: SA5 





(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47:

GTC CCC TAC CGA AAT GCC TCT GGG GTT TAT CAT rTC ACC 39

AAT GAT TGC CCA AAC TCT TCC ATA GTC TAC GAG \CT GAT 78

AAC CTG ATT CTG CAC GCA CCT GGT TGC GTG CCC X3T GTC 117

AAG GAA GGT AAT GTC AGT AGG TGC TGG GTC CAA lTC ACC 156

CCC ACA TTG TCA GCC CCG AAC CTC GGA GCG GTC iCG GCT 195

CCT CTT CGG AGG GTC GTT GAC TAC TTA GCG GGA rGG GCT 234

GCC CTC TGC TCC GCA CTA TAC GTC GGG GAC GCG GC GGG 273

GCA GTG TTC TTG GTA GGC CAA ATG TTC ACC TAT GG CCT 312

CGC CAG CAT ACT ACG GTG CAG GAC TGC AAC TGT CC ATT 351

TAC AGC GGC CAT ATC ACC GGC CAC CGA ATG GCA GG GAC 390

ATG ATG ATG AAT TGG TCA CCT ACG ACA GCC TTG TG ATG 429

GCC CAG GTG CTA CGG ATT CCC CAA GTG GTC ATT AC ATC 468

ATT GCC GGG GGC CAC TGG GGG GTC TTG TTC GCC TC GCA 507

TAC TTC GCG TCA GCG GCT AAC TGG GCT AAG GTT TG CTG 546

GTC CTG TTT CTG TTT GCG GGG GTC GAT GGC 576

(2) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SA6 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48: 

GTT CCT TAC CGG AAT GCC TCT GGG GTG TAT CAT GTT ACC 39 

AAT GAT TGC CCA AAC TCT TCC ATA GTC TAT GAG GCT GAT 78 

GAC CTG ATC CTA CAC GCA CCT GGC TGC GTG CCC TCT GTC 117

CGG AAG GAT AAT GTC AGT AGA TGC TGG GTT CAT ATC ACC 156 








Awn 


in 


X wA 


GCC 


CCG AGC 














195 


CCT CTT CGG AGG GCC GTT GAT TAC TTG GCG GGA GGG GCC 234

GCC CTG TGC TCC GCG TTA TAC GTC GGA GAC GTG TGC GGG 273


GCA 

wWl 


TTG 


TTT 


TTG 


GTA 


GGC 


CAA 


ATG 


TTC 


ALL 


rri-n rp 


Aksfa 




312 


CGC CAG CAT GCT ACG GTA CAG GAC TGC AAC TGC TCC ATT 351

TAC AGT GGC CAT ATC ACT GGC CAC CGG ATG GCA TGG GAC 390

ATG ATG ATG AAT TGG TCA CCC GCG ACA GCC TTG GTG ATG 429

GCC CAA ATG CTA CGG ATT CCC CAG GTG GTC ATT GAC ATC 468

ATT GCC GGG GGC CAC TGG GGG GTC TTG TTC GCC GCT GCA 507

TAC TTC GCG TCG GCG GCT AAC TGG GCT AAG GTT GTG CTG 546


GTC 


TTG 




CTG 


TTT 


GCG 


GGG 


GTT 


GAT 


GCC 








576 


(2) INFORMATION FOR SEQ ID NO: 49:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SA7

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49:

GTC CCC TAC CGA AAT GCC TCC GGG GTT TAT CAT GTC ACC 39

AAT GAT TGC CCG AAC TCT TCC ATA GTC TAT GAG GCT GAC 78

AAC CTG ATC CTG CAC GCA CCT GGT TGC GTG CCC TGT GTC 117

AGA CAA AAT AAT GTC AGT AGG TGC TGG GTC CAA ATC ACC 156

CCC ACA TTG TCA GCC CCG AAC CTC GGA GCG GTC ACG GCT 195

CCT CTT CGG AGG GCC GTT GAC TAC CTA GCG GGA GGG GCT 234

GCC CTC TGC TCC GCG CTA TAC GTC GGG GAC GCG TGC GGG 273

GCA GTG TTT TTG GTA GGC CAG ATG TTC AGC TAT AGG CCT 312

CGC CAG CAC ACT ACG GTG CAG GAC TGC AAC TGT TCC ATT 351

TAC AGT GGC CAT ATC ACC GGC CAC CGA ATG GCA TGG GAC 390

ATG ATG ATG AAT TGG TCA CCT ACG ACA GCC TTG GTG ATG 429

GCC CAG TTG CTA CGG ATT CCC CAG GTG GTC ATC GAC ATC 468

ATT GCC GGG GGC CAC TGG GGG GTC TTG TTC GCC GCC GCA 507

TAT TTC GCG TCA GCG GCT AAC TGG GCT AAG GTT GTG CTG 546

GTC TTG TTT CTG TTT GCG GGG GTC GAT GCC 576


(2) INFORMATION FOR SEQ ID NO: 50:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SA13

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50:



VJl <L 


ccc 


tap 


WW"A 


AAT 


GAT 






GAC 


CTCX 


ATP 


• 1 *• 1 "TV 


AGG 


CAG 


GGT 

WW X 


7V 7V»T» 
AA1 


ccc 


ACA 


CTG 


TCA 


CCT 


CTT 


CGG 


AGG 


GCC 


CTT 


TGC 


TCC 


GCA 


GTG 


TTT 


TTG 


CGC 


CGG 


CAT 


AAT 


TAC 


AGT 


GGC 


CAC 


ATG 


ATG 


ATG 


AAT 


GCC 


CAG 


TTG 


TTA 


ATT 


GCC 


GGG 


GCC 


TAC 


TAC 


GCG 


TCG 


GTC 


CTG 


TTT 


CTG 



1LT 


GGG 


GTT 


TAT 


CAT 


GTC 


ACC 


39 


ill 


ATC 


mm/1 

GTC 


TAC 


GAG 


GCT 


GAT 


78 






TGC 


GTG 


CCC 


TGT 


GTT 


117 


AGG 








CAG 


ATC 


ACC 


156 


AGC 


CTC 


GGA 


GCG 


GTC 


ACG 


GCT 


195 


GAC 


TAC 


TTA 


GCG 


GGG 


GGG 


GCT 


234 


TAC 


GTC 


GGA 


GAC 


GCG 


TGC 


GGG 


273 


CAA 


ATG 


TTC 


ACC 


TAT 


AGC 


CCT 


312 


CAG 


GAC 


TGC 


AAC 


TGT 


TCC 


ATT 


351 


GGC 


CAC 


CGG 


ATG 


GCA 


TGG 


GAC 


390 


CCT 


ACA 


ACA 


GCT 


TTG 


GTG 


ATG 


429 


CCC 


CAG 


GTG 


GTC 


ATT 


GAC 


ATC 


468 


GGG 


GTC 


TTG 


TTC 


GCC 


GCC 


GCA 


507 


AAC 


TGG 


GCC 


AAG 


GTT GTG 


CTG 


546 


GGG 


GTC 


GAT 


GCC 








576 



(2) INFORMATION FOR SEQ ID NO: 51: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 

(C) INDIVIDUAL ISOLATE: HK2 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51: 



CTT ACC TAC GGC AAC TCC AGT GGG CTA TAC CAT CTC ACA 39 

AAT GAT TGC CCC AAC TCC AGC ATC GTG CTG GAG GCG GAT 78 

GCT ATG ATC TTG CAT TTG CCT GGA TGC TTG CCT TGT GTG 117 

AGG GTC GAT GAT CGG TCC ACC TGT TGG CAT GCT GTG ACC 156 

CCC ACC CTG GCC ATA CCA AAT GCT TCC ACG CCC GCA ACG 195

GGA TTC CGC AGG CAT GTG GAT CTT CTT GCG GGC GCC GCA 234 

GTG GTT TGC TCA TCC CTG TAC ATC GGG GAC CTG TGT GGC 273 

TCT CTC TTT TTG GCG GGA CAA CTA TTC ACC TTT CAG CCC 312 

CGC CGT CAT TGG ACT GTG CAA GAC TGC AAC TGC TCC ATC 351 

TAT ACA GGC CAC GTC ACC GGC CAC AGG ATG GCT TGG GAC 390 

ATG ATG ATG AAC TGG TCA CCC ACA ACC ACT CTG GTC CTA 429 

TCT AGC ATC TTG AGG GTA CCT GAG ATT TGT GCG AGT GTG 468 

ATA TTT GGT GGC CAT TGG GGG ATA CTA CTA GCC GTT GCC 507 

TAC TTT GGC ATG GCT GGC AAC TGG CTA AAA GTT CTG GCT 546 

GTT CTG TTC CTA TTT GCA GGG GTT GAA GCA 576 

(2) INFORMATION FOR SEQ ID NO: 52:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: DK7

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52:


Tyr Gin 


Val 


Arg 


Asn 
5 


Ser 


Thr 


Gly 


Cys Pro 


Asn 


Ser 


Ser 


He 


Val 


Tvr 








20 






His Thr 


Pro 


Gly 


Cys 


Val 


Pro 


Cvs 








35 






Arg Cys 


Trp 


Val 


Ala 


Met 


Thr 


Pro 








50 








Lys Leu 


Pro 


Thr 


Ala 


Gin 


Leu 


Arg 








65 






Gly Ser 


Ala 


Thr 


Leu 


Cys 


Ser 


Ala 








80 






Gly Ser 


Val 


Phe 


Leu 


Val 


Gly 


Gin 








95 






Arg His 


Trp 


Thr 


Thr 


Gin 


Gly 


Cys 








110 




His lie 


Thr 


Gly 


His 


Arg 


Met 


Ala 








125 








Ser Pro 


Thr 


Thr 


Ala 


Leu 


Val 


Val 








140 








Gin Ala 


He 


Leu 


Asp 


Met 


He 


Ala 








155 








Ala Gly 


He 


Ala 


Tyr 


Phe 


Ser 


Met 








170 








Leu Val 


Val 


Leu 


Leu 


Leu 


Phe 


Ala 



10 15 
;iu Ala Ala Asp Ala He Leu 
25 30 
tel Arg Glu Gly Asn Val Ser 
40 45 
?hr Val Ala Thr Arg Asp Gly 
55 60 
\rg His He Asp Leu Leu Val 
70 75 
jeu Tyr Val Gly Asp Leu Cys 
85 90 
jeu Phe Thr Phe Ser Pro Arg 
100 105 
isn Cys Ser He Tyr Pro Gly 
115 120 
*rp Asp Met Met Met Asn Trp 
130 135 
kla Gin Leu Leu Arg He Pro 
145 150 
\ly Ala His Trp Gly Val Leu 
160 165 
r al Gly Asn Trp Ala Lys Val 
175 180 

[ly Val Asp Ala 
185 190 

(2) INFORMATION FOR SEQ ID NO: 53: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: DK9

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53:

Tyr Gln Val Arg Asn Ser Ser Gly Leu Tyr His Val Thr Asn Asp

5 10 15

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp Ala Ile Leu

20 25 30

His Ser Pro Gly Cys Val Pro Cys Val Arg Glu Gly Asn Ala Ser

35 40 45

Lys Cys Trp Val Ala Val Ala Pro Thr Val Ala Thr Arg Asp Gly

50 55 60

Lys Leu Pro Ala Thr Gln Leu Arg Arg His Ile Asp Leu Leu Val

65 70 75

Gly Ser Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys

80 85 90

Gly Ser Val Phe Leu Val Gly Gln Leu Phe Thr Phe Ser Pro Arg

95 100 105

Arg His Trp Thr Thr Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly

110 115 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp

125 130 135

Ser Pro Thr Ala Ala Leu Val Met Ala Gln Leu Leu Arg Ile Pro

140 145 150

Gln Ala Ile Leu Asp Met Ile Ala Gly Ala His Trp Gly Val Leu

155 160 165

Ala Gly Ile Ala Tyr Phe Ser Met Val Gly Asn Trp Ala Lys Val

170 175 180

Val Val Val Leu Leu Leu Phe Gly Val Asp Ala

185 190

(2) INFORMATION FOR SEQ ID NO: 54:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: DR1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54:


His Gin 


Val 


Arg 


Asn 


Ser 


Thr 


Gly 


Leu 


Tyr 


His 


Val 


Thr 


Cys Pro 






5 










10 








Asn 


Ser 


Ser 


He 


Val 


Tyr 


Glu 


Ala 


Ala 


Asp 


Ala 


His Ala 






20 










25 






Pro 


Gly 


Cys 


Val 


Pro 


Cys 


Val 


Arg 


Glu 


Gly 


Asn 


Arg Cys 






35 










40 






Trp 


Val 


Ala 


Val 


Thr 


Pro 


Thr 


Val 


Ala 


Thr 


Arg 








50 










55 






Lys Leu 


Pro 


Thr 


Thr 


Gin 


Leu 


Arg 


Arg 


His 


He 


Asp 


Leu 


Gly Ser 






65 










70 






Ala 


Thr 


Leu 


Cys 


Ser 


Ala 


Leu 


Tyr 


Val 


Gly 


Asp 


Gly Ser 






80 










85 




Val 


Phe 


Leu 


Val 


Gly 


Gin 


Leu 


Phe 


Thr 


Phe 


Ser 








95 










100 









15 
Leu 

30 
Ser 

45 
Gly 

60 
Val 

75 
Cys 

90 
Arg 
105 
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Arg His Trp 
His lie Thr 
Ser Pro Thr 
Gin Ala lie 
Ala Gly He 
Val Val Val 



IliX 


inr 






11U 




Gly 


ttJ 

His 


Arg 




125 




Thr 


Ala 


Leu 




140 




Leu 


Asp 


Met 




155 




Ala 


Tyr 


Phe 




170 




Leu 


Leu 


Leu 




185 







- 82 


- 


Asp 


Cys 


Asn 


Met 


Ala 


Trp 


Val 


Met 


Ala 


He 


Ala 


Gly 


Ser 


Met 


Val 


Phe 


Ala 


Gly 



Cys Ser lie 
115 

Asp Met Met 
130 

Gin Leu Leu 
145 

Ala His Trp 
160 

Gly Asn Trp 
175 

Val Asp Ala 
190 



Tyr Pro Gly 
120 

Met Asn Trp 

135 

Arg lie Pro 
150 

Gly Val Leu 
165 

Ala Lys Val 
180 






(2) INFORMATION FOR SEQ ID NO: 55: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: DR4 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55: 



His Gln Val Arg Asn Ser Thr Gly Leu Tyr His Val Thr Asn Asp

5 10 15

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp Ala Ile Leu

20 25 30

His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Gly Asn Thr Ser

35 40 45

Arg Cys Trp Val Ala Val Thr Pro Thr Val Ala Thr Arg Asp Gly

50 55 60

Lys Leu Pro Thr Thr Gln Leu Arg Arg His Ile Asp Leu Leu Val

65 70 75

Gly Ser Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys

80 85 90

Gly Ser Val Phe Leu Val Gly Gln Leu Phe Thr Phe Ser Pro Arg

95 100 105

His His Trp Thr Thr Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly

110 115 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp

125 130 135

Ser Pro Thr Thr Ala Leu Val Val Ala Gln Leu Leu Arg Ile Pro

140 145 150

Gln Ala Ile Leu Asp Met Ile Ala Gly Ala His Trp Gly Val Leu

155 160 165

Ala Gly Ile Ala Tyr Phe Ser Met Val Gly Asn Trp Ala Lys Val

170 175 180

Leu Val Val Leu Leu Leu Phe Ala Gly Val Asp Ala

185 190

(2) INFORMATION FOR SEQ ID NO: 56:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: S14

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56:



Tyr Gin 


Val 


Arg Asn 


Cys Pro 


Asn 


5 

Ser Ser 






20 


His Ala 


Pro 


Gly Cys 






35 


Arg Cys 


Trp 


Val Ala 






50 


Lys Leu 


Pro 


Ala Thr 






65 


Gly Ser 


Ala 


Thr Leu 






80 


Gly Ser 


Val 


Phe Leu 






95 


Arg Leu 


Trp 


Thr Thr 






110 


His lie 


Thr 


Gly His 






125 


Ser Pro 


Thr 


Thr Ala 






140 


Gin Ala 


lie 


Leu Asp 






155 


Ala Gly 


lie 


Ala Tyr 






170 


Leu Val 


Val 


Leu Leu 






185 



10 15 
Cle Val Tyr Glu Thr Ala Asp Ala He Leu 

25 30 
tel Pro Cys Val Arg Glu Gly Asn Thr Ser 

40 45 
let Thr Pro Thr Val Ala Thr Arg Asp Gly 

15 . _ . _ 50 55 60 

lln Leu Arg Arg Tyr He Asp Leu Leu Val 

70 75 
Jys Ser Ala Leu Tyr Val Gly Asp Leu Cys 

85 90 
ral Gly Gin Leu Phe Thr Phe Ser Pro Arg 

100 105 
rln Asp Cys Asn Cys Ser He Tyr Pro Gly 

115 120 
irg Met Ala Trp Asp Met Met Met Asn Trp 

130 135 
ieu Val Val Ala Gin Leu Leu Arg He Pro 

145 iso 
fet He Ala Gly Ala His Trp Gly Val Leu 

160 165 
'he Ser Met Val Gly Asn Trp Ala Lys Val 

175 180 
-eu Phe Ala Gly Val Asp Ala 

190 



(2) INFORMATION FOR SEQ ID NO: 57:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: S18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57:

Tyr Gln Val Arg Asn Ser Thr Gly Leu Tyr His Val Thr Asn Asp

5 10 15

Cys Pro Asn Ser Ser Ile Val Tyr Glu Thr Ala Asp Thr Ile Leu

20 25 30

His Ser Pro Gly Cys Val Pro Cys Val Arg Glu Gly Asn Ala Ser

35 40 45

Arg Cys Trp Val Pro Val Ala Pro Thr Val Ala Thr Arg Asp Gly

50 55 60

Lys Leu Pro Ala Thr Gln Leu Arg Arg His Ile Asp Leu Leu Val

65 70 75

Gly Ser Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys

80 85 90

Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Ile Ser Pro Arg

95 100 105

Arg His Trp Thr Thr Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly

110 115 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp

125 130 135

Ser Pro Thr Thr Ala Leu Val Ile Ala Gln Leu Leu Arg Val Pro

140 145 150

Gln Ala Val Leu Asp Met Ile Ala Gly Ala His Trp Gly Val Leu

155 160 165

Ala Gly Ile Ala Tyr Phe Ser Met Ala Gly Asn Trp Ala Lys Val

170 175 180

Leu Leu Val Leu Leu Leu Phe Ala Gly Val Asp Ala

185 190

(2) INFORMATION FOR SEQ ID NO: 58:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SW1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58:

Tyr Gln Val Arg Asn Ser Ser Gly Leu Tyr His Val Thr Asn Asp

5 10 15

Cys Pro Asn Ser Ser Ile Val Tyr Glu Thr Ala Asp Ala Ile Leu

20 25 30

His Ser Pro Gly Cys Val Pro Cys Val Arg Glu Asp Gly Ala Pro

35 40 45

Lys Cys Trp Val Ala Val Ala Pro Thr Val Ala Thr Arg Asp Gly

50 55 60

Lys Leu Pro Ala Thr Gln Leu Arg Arg His Ile Asp Leu Leu Val

65 70 75
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Gly Ser 


Ala 


Thr 


Leu 


Cys 


Ser 


Ala 


Leu 


Tyr 








80 










85 


Gly Ser 


Val 


Phe 


Leu 


Val 


Ser 


Gin 


Leu 


Phe 


Arg His 






95 










100 


Trp 


Thr 


Thr 


Gin 


Asp 


Cys 


Asn 


Cys 


His lie 






110 










115 


Thr 


Gly 


His 


Arg 


Met 


Ala 


Trp 


Asp 








125 








130 


Ser Pro 


Thr 


Thr 


Ala 


Leu 


Val 


Val 


Ala 


Gin 


Gin Ala 






140 










145 


Val 


Leu 


Asp 


Met 


He 


Ala 


Gly 


Ala 








155 








160 


Ala Gly 


He 


Ala 


Tyr 


Phe 


Ser 


Met 


Val 


Gly 








170 










175 


Leu lie 


Val 


Leu 


Leu 


Leu 


Phe 


Ser 


Gly 


Val 








185 








190 



90 
Arg 
105 
Gly 
120 
Trp 
135 
Pro 
150 
Leu 

165 
Val 
180 



(2) INFORMATION FOR SEQ ID NO: 59:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: US11

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59:

Tyr Gln Val Arg Asn Ser Thr Gly Leu Tyr His Val Thr Asn Asp

5 10 15

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp Ala Ile Leu

20 25 30

His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Gly Asn Ala Ser

35 40 45

Arg Cys Trp Val Ala Met Thr Pro Thr Val Ala Thr Arg Asp Gly

50 55 60

Lys Leu Pro Thr Thr Gln Leu Arg Arg His Ile Asp Leu Leu Val

65 70 75

Gly Ser Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys

80 85 90

Gly Ser Val Phe Leu Val Gly Gln Leu Phe Thr Phe Ser Pro Arg

95 100 105

Arg His Trp Thr Thr Gln Gly Cys Asn Cys Ser Ile Tyr Pro Gly

110 115 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp

125 130 135

Ser Pro Thr Ala Ala Leu Val Val Ala Gln Leu Leu Arg Ile Pro

140 145 150

Gln Ala Ile Leu Asp Met Ile Ala Gly Ala His Trp Gly Val Leu

155 160 165

Ala Gly Ile Ala Tyr Phe Ser Met Val Gly Asn Trp Ala Lys Val

170 175 180

Leu Val Val Leu Leu Leu Phe Ala Gly Val Asp Ala

185 190



(2) INFORMATION FOR SEQ ID NO: 60:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: D1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60:

Tyr Glu Val Arg Asn Val Ser Gly Val Tyr His Val Thr Asn Asp

5 10 15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Thr Ala Asp Met Ile Met

20 25 30

His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Asp Asn Ser Ser

35 40 45

Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Gly

50 55 60

Asn Val Pro Thr Thr Ala Ile Arg Arg His Val Asp Leu Leu Val

65 70 75

Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Cys

80 85 90

Gly Ser Val Phe Leu Ile Ser Gln Leu Phe Thr Leu Ser Pro Arg

95 100 105

Arg His Glu Thr Val Gln Glu Cys Asn Cys Ser Ile Tyr Pro Gly

110 115 120

His Val Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp

125 130 135

Ser Pro Thr Thr Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro

140 145 150

Gln Ala Val Met Asp Met Val Ala Gly Ala His Trp Gly Val Leu

155 160 165

Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val

170 175 180

Leu Ile Val Met Leu Leu Phe Ala Gly Val Asp Gly

185 190





(2) INFORMATION FOR SEQ ID NO: 61: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: D3 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61: 

/al Ser Gly Val Tyr Gin Val Thr Asn Asp 

10 15 
Cle Val Tyr Glu Thr Ala Asp Met He Met 

25 30 
Jal Pro Cys Val Arg Glu Asp Asn Ser Ser 

40 45 
jeu Thr Pro Thr Leu Ala Ala Arg Asn Ser 

55 60 
["hr He Arg Arg His Val Asp Leu Leu Val 

70 75 
:ys Ser Ala Met Tyr Val Gly Asp Leu Cys 

85 90 
ral Ser Gin Leu Phe Thr Phe Ser Pro Arg 

100 105 
Hn Glu Cys Asn Cys Ser He Tyr Pro Gly 

115 120 
irg Met Ala Trp Asp Met Met Met Asn Trp 

130 135 
■eu Val Val Ser Gin Leu Leu Arg He Pro 

145 150 
tet Val Ala Gly Ala His Trp Gly Val Leu 

160 165 
yr Ser Met Val Gly Asn Trp Ala Lys Val 

175 180 
ieu Phe Ala Gly Val Asp Gly 

190 

(2) INFORMATION FOR SEQ ID NO: 62: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: DK1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62:

Tyr Glu Val Arg Asn Val Ser Gly Val Tyr His Val Thr Asn Asp

5 10 15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Val Asp Val Ile Met

20 25 30



20 



Tyr Glu 


Val Arg 


Asn 


Cys Ser 


Asn Ser 


5 

Ser 






20 


Hie Thr 


Pro Gly 


Cys 






35 


Arg Cys 


Trp Val 


Ala 






50 


Ser Val 


Pro Thr 


Thr 






65 


Gly Ala 


Ala Ala 


Phe 






80 


Gly Ser 


Val Phe 


Leu 






95 


Arg His 


Glu Thr 


Val 


His Val 




110 


Thr Gly 


His 






125 


Ser Pro 


Thr Ala 


Ala 






140 


Gin Ala 


Val Val 


Asp 






155 


Ala Gly 


Leu Ala 


Tyr 






170 


Leu lie 


Val Met 


Leu 






185 
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10 



15 



30 



His 


Thr 


Pro 


Vjrljr 


v~ys 
j ^ 


vax. 


irro 


cys 


val 


Arg 

A ft 


Glu Asn 


Asn Hxs Ser 

45 


Arg 


Cys 


irp 


VcLJL 


Ala 

/u.a 


iteu 


Tnr 


Pro 


Thr 


Leu 


Ala Ala 


Arg Asn Ala 


Ser 


lie 
















55 




60 




X XlxT 


xnr 


inr 


lie 


Arg 


Arg 


Hxs 


Val Asp 


Leu Leu Val 




















70 


mm 

75 


Gly Ala 


TV 1 a 


/Ua 


pne 


Cys 


Ser 


Ala 


Met 


Tyr 


Val Gly 


Asp Leu Cys 










ou 










85 


90 


Gly Ser 


Val 


file 


Leu 


val 


Ser 


Gin 


Leu 


Phe 


Thr Phe 


Ser Pro Arg 


Arg 


His 


bJ.U 




Q C 










100 




105 


(Til* 

ixir 


Axa 


Gin 


Asp 


Cys 


Asn 


Cys 


Ser He 


Tyr Pro Gly 


His 








XJLU 










115 




120 


Val 


Ser 


Gly 


Hxs 


Arg 


Met 


Ala 


Trp 


Asp 


Met Met 


Met Asn Trp 


Ser 








125 










130 




135 


Pro 


Thr 


Thr 


Ala 


Leu 


Val 


Leu 


Ser 


Gin 


Leu Leu 


Arg He Pro 


Gin 


Ala 






140 










145 




150 


Val 


Val 


Asp 


Met 


Val 


Ala 


Gly 


Ala 


His Trp 


Gly Val Leu 


Ala 








155 










160 


165 


Gly 


Leu 


Ala 


Tyr 


Tyr 


Ser 


Met 


Ala 


Gly 


Asn Trp 


Ala Lys Val 










170 










175 


180 


Leu 


lie 


Val 


Leu 


Leu 


Leu 


Phe 


Ala 


Gly 


Val 


Asp Gly 












185 










190 





(2) INFORMATION FOR SEQ ID NO: 63:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: HK3

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63:

Tyr Glu Val Arg Asn Val Ser Gly Ile Tyr His Val Thr Asn Asp

5 10 15

Cys Ser Asn Ser Ser Val Val Tyr Glu Thr Ala Asp Met Ile Met

20 25 30

His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Asn Asn Ser Ser

35 40 45

Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Val

50 55 60

Ser Val Pro Thr Thr Thr Ile Arg Arg His Val Asp Leu Leu Val

65 70 75

Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Cys

80 85 90

Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Phe Ser Pro Arg

95 100 105

Arg His Glu Thr Val Gln Asp Cys Asn Cys Ser Leu Tyr Pro Gly

110 115 120

His Val Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp

125 130 135

Ser Pro Thr Ala Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro

140 145 150

Gln Ala Val Val Asp Met Val Ala Gly Ala His Trp Gly Val Leu

155 160 165

Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val

170 175 180

Leu Ile Val Met Leu Leu Phe Ala Gly Val Asp Gly

185 190

(2) INFORMATION FOR SEQ ID NO: 64: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: HK4

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64:


His Glu 


Val 


His 


Asn 


Val 


Ser Gly 


lie Tyr 


His 








5 






10 




Cys Ser 


Asn 


Ser 


Ser 


He 


Val Tyr 


Glu Ala 


Ala 


His Thr 






20 




25 




Pro 


Gly 


Cys 


Val 


Pro Cys 


Val Arg 


Glu 








35 






40 




Arg Cys 


Trp 


Val 


Ala 


Leu 


Thr Pro 


Thr Leu 


Ala 








50 






55 




Ser He 


Pro 


Thr 


Thr 


Thr 


He Arg 


Arg His 


Val 


Gly Ala 






65 






70 




Ala 


Ala 


Phe 


Cys 


Ser Ala 


Met Tyr 


Val 








80 






85 




Gly Ser 


Val 


Phe 


Leu 


Val 


Ser Gin 


Leu Phe 


Thr 


Arg His 






95 






100 




Glu 


Thr 


Val 


Gin 


Asp Cys 


Asn Cys 


Ser 


His Val 






110 






115 




Ser 


Gly 


His 


Arg 


Met Ala 


Trp Asp 


Met 








125 






130 




Ser Pro 


Thr 


Ala 


Ala 


Leu 


Val Val 


Ser Gin 


Leu 








140 






145 




Gin Ala 


Val 


Met 


Asp 


Met 


Val Ala 


Gly Ala 


His 








155 






160 




Ala Gly 


Leu 


Ala 


Tyr 


Tyr 


Ser Met 


Val Gly 


Asn 








170 






175 




Leu He 


Val 


Met 


Leu 


Leu 


Phe Ala 


Gly Val 


Asp 








185 






190 



15 
Met 

30 
Ser 

45 
Ala 

60 
Val 

75 
Cys 

90 
Arg 
105 
Gly 
120 
Trp 
135 
Pro 
150 
Leu 
165 
Val 
180 



35 



1 
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(2) INFORMATION FOR SEQ ID NO: 65: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : unknown 

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: HK5 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65: 



Tyr Glu Val Arg Asn Val Ser Gly Val Tyr His Val Thr Asn Asp

5 10 15

Cys Ser Asn Leu Ser Ile Val Tyr Glu Thr Thr Asp Met Ile Met

20 25 30

His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Asn Asn Ser Ser

35 40 45

Arg Cys Trp Val Ala Leu Ala Pro Thr Leu Ala Ala Arg Asn Ala

50 55 60

Ser Val Pro Thr Thr Ala Ile Arg Arg His Val Asp Leu Leu Val

65 70 75

Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Cys

80 85 90

Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Phe Ser Pro Arg

95 100 105

Arg His Glu Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly

110 115 120

His Val Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp

125 130 135

Ser Pro Thr Thr Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro

140 145 150

Gln Ala Val Val Asp Met Val Ala Gly Ala His Trp Gly Val Leu

155 160 165

Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val

170 175 180

Leu Ile Val Met Leu Leu Phe Ala Gly Val Asp Gly

185 190









(2) INFORMATION FOR SEQ ID NO: 66: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: HK8

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66:

Tyr Glu Val Arg Asn Val Ser Gly Ile Tyr His Val Thr Asn Asp

5 10 15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Thr Ala Asp Met Ile Met

20 25 30

His Thr Pro Gly Cys Met Pro Cys Val Arg Glu Asn Asn Ser Ser

35 40 45

Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Val

50 55 60

Ser Val Pro Thr Thr Thr Ile Arg Arg His Val Asp Leu Leu Val

65 70 75

Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Cys

80 85 90

Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Phe Ser Pro Arg

95 100 105

Arg His Glu Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly

110 115 120

His Val Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp

125 130 135

Ser Pro Thr Thr Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro

140 145 150

Gln Ala Ile Val Asp Met Val Ala Gly Ala His Trp Gly Val Leu

155 160 165

Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val

170 175 180

Leu Ile Val Met Leu Leu Phe Ala Gly Val Asp Gly

185 190

(2) INFORMATION FOR SEQ ID NO: 67:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: IND5

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67:

Tyr Glu Val Arg Asn Val Ser Gly Val Tyr His Val Thr Asn Asp

5 10 15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp Met Ile Met

20 25 30

His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Gly Asn Ser Ser

35 40 45

Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Ala

50 55 60

Ser Val Ser Thr Thr Thr Ile Arg His His Val Asp Leu Leu Val

65 70 75



Gly Ala 




TV 1 a 
Aia 


irne 
on 


uys 


ser 


Ala 


Met 


Tyr 
85 


Gly Ser 


Val 


xrlie 


lieu 


vax 


ser 


Gin 


Leu 


Phe 


At~ct TTi o 






95 










100 




11117 


Vol 


bin 


Asp 


cys 


Asn 


Cys 


His Val 






11 A 
1XU 










115 






XIX s 




lYie l 


Axa 


Trp 


Asp 








125 








130 


Ser Pro 


Thr 


Ala 


Ala 


Leu 


Val 


Val 


Ser 


Gin 


Gin Ala 






140 










145 


Val 


Val 


Asp 


Met 


Val 


Ala 


Gly 


Ala 


Ala Gly 






155 








160 


Leu 


Ala 


Tyr 
170 


Tyr 


Ser 


Met 


Val 


Gly 
175 


Leu lie 


Val 


Met 


Leu 


Leu 


Phe 


Ala 


Gly 


Val 








185 








190 



90 
Arg 
105 
Gly 
120 
Trp 
135 
Pro 
150 
Leu 
165 
Val 
180 



10 



(2) INFORMATION FOR SEQ ID NO: 68: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: IND8 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68:

Tyr Glu Val Arg Asn Val Ser Gly Val Tyr His [...]

5 10 15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Ala [...] Met

20 25 30

His Thr Pro Gly Cys Val Pro Cys Val Arg Glu [...] Ser

35 40 45

Ser Cys Trp Val Ala Leu Thr Pro Thr Leu Ala [...] Ala

50 55 60

Ser Val Pro Thr Thr Thr Ile Arg Arg His Val [...] Val

65 70 75

Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val [...] Cys

80 85 90

Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr [...] Arg

95 100 105

Arg His Glu Thr Val Gln Asp Cys Asn Cys Ser [...] Gly

110 115 120

His Val Ser Gly His Arg Met Ala Trp Asp Met [...] Trp

125 130 135

Ser Pro Thr Ala Ala Leu Val Val Ser Gln Leu [...] Pro

140 145 150

Gln Ala Val Val Asp Met Val Ala Gly Ala His [...] Leu

155 160 165

Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val

170 175 180

Leu Ile Val Met Leu Leu Phe Ala Gly Val Asp Gly

185 190

(2) INFORMATION FOR SEQ ID NO: 69: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : unknown 

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: P10 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69: 

Tyr Glu Val Arg Asn Val Ser Gly Val Tyr His Val Thr Asn Asp

5 10 15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp Met Ile Met

20 25 30

His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Asn Asn Ser Ser

35 40 45

Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Ser

50 55 60

Ser Val Pro Thr Thr Ala Ile Arg Arg His Val Asp Leu Leu Val

65 70 75

Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Cys

80 85 90

Gly Ser Val Leu Leu Val Ser Gln Leu Phe Thr Phe Ser Pro Arg

95 100 105

Arg His Trp Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly

110 115 120

His Val Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp

125 130 135

Ser Pro Thr Ala Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro

140 145 150

Gln Ala Ile Leu Asp Val Val Ala Gly Ala His Trp Gly Val Leu

155 160 165

Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val

170 175 180

Leu Ile Val Met Leu Leu Phe Ala Gly Val Asp Gly

185 190

(2) INFORMATION FOR SEQ ID NO: 70: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: S9 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70: 



Tyr Glu Val Arg Asn Val Ser Gly Ala Tyr His Val Thr Asn Asp

5 10 15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp Val Ile Met

20 25 30

His Thr Pro Gly Cys Val Pro Cys Val Gln Glu Gly Asn Ser Ser

35 40 45

Gln Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Ala

50 55 60

Thr Val Pro Thr Thr Thr Ile Arg Arg His Val Asp Leu Leu Val

65 70 75

Gly Ala Ala Val Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Cys

80 85 90

Gly Ser Val Phe Leu Ile Ser Gln Leu Phe Thr Ile Ser Pro Arg

95 100 105

Arg His Glu Thr Val Gln Asn Cys Asn Cys Ser Ile Tyr Pro Gly

110 115 120

His Val Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp

125 130 135

Ser Pro Thr Thr Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro

140 145 150

Gln Ala Val Met Asp Met Val Ala Gly Ala His Trp Gly Val Leu

155 160 165

Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val

170 175 180

Leu Ile Val Met Leu Leu Phe Ala Gly Val Asp Gly

185 190



(2) INFORMATION FOR SEQ ID NO: 71: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: S45
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71:

Tyr Glu Val Arg Asn Val Ser Gly Ala Tyr His Val Thr Asn Asp

5 10 15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Val Asp Val Ile Leu

20 25 30
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10 



15 



20 



25 



30 



His Thr 


Pro 


Gly 


Cvs 


Val 


Pro 
x. x. 




Val 
v ctx 




uiu ash /isn oci oer 


Arg Cys 






35 










40 


AC 


Tip 


Val 


Ala 


Leu 

«UJ Vv KJk 


Thr 

X. XXX. 






iJcU 




Ser Val 






50 










55 




Pro 


Thr 


Thr 

AAA* 


Thr 


lie 

•X. «l, v_» 




A ttt 


X1J-B 


vai iisp lieu ijeu vai 


Gly Ala 






65 










70 


/D 


Ala 


Ala 


Phe 


Cvs 


Ser 


Ala 




Ayr 


"XTja 1 ^Zll t t Aar% T.aii P\fe 

vclx uiy >isjp ueu i-ys 


Gly Ser 






80 










85 


-7 U 


Val 


Phe 


Leu 


Val 


Ser 

«X- 


Gin 


Li Oil 


XVXXC? 


T'Ht* Dhfi Got* Dm A -r^rr 
•IX IX- Jrllc o6X7 JriO /ixy 


Arg His 






95 










100 

^ V V/ 




Glu 


Thr 


Val 


Gin 




w y o 






oer x x e Ayr rro vj J.y 


His Val 






110 












120 


Thr 

* XXX. 


(11 v 


XXX 0 


A vex 




Aj.a 


irp 


ASp 


Met Met Met Asn Trp 


125 130 135

Ser Pro Thr Ala Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro

140 145 150

Gln Ala Val Val Asp Met Val Ala Gly Ala His Trp Gly Val Leu

155 160 165

Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val

170 175 180

Leu Ile Val Met Leu Leu Phe Ala Gly Val Asp Gly

185 190



(2) INFORMATION FOR SEQ ID NO: 72: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: SA10 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72:

Tyr Glu Val Arg Asn Val Ser Gly Met Tyr His Val Thr Asn Asp

5 10 15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp Met Ile Met

20 25 30

His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Asn Asn Ser Ser

35 40 45

Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Ser

50 55 60

Ser Val Pro Thr Thr Thr Ile Arg Arg His Val Asp Leu Leu Val

65 70 75

Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Cys

80 85 90

Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Phe Ser Pro Arg

95 100 105

Arg Tyr Glu Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly

110 115 120

Arg Val Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp

125 130 135

Ser Pro Thr Thr Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro

140 145 150

Gln Ala Ile Val Asp Met Val Ala Gly Ala His Trp Gly Val Leu

155 160 165

Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val

170 175 180

Leu Ile Val Met Leu Leu Phe Ala Gly Val Asp Gly

185 190



(2) INFORMATION FOR SEQ ID NO: 73: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SW2

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73:

Tyr Glu Val Arg Asn Val [...]

5 10 15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Thr Ala Asp Met Ile Met

20 25 30

His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Ala Asn Ser Ser

35 40 45

Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Thr

50 55 60

Ser Val Pro Thr Thr Thr Ile Arg Arg His Val Asp Leu Leu Val

65 70 75

Gly Ala Ala Ala Phe Cys Ser Val Met Tyr Val Gly Asp Leu Cys

80 85 90

Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Phe Ser Pro Arg

95 100 105

Arg His Glu Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly

110 115 120

His Val Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp

125 130 135

Ser Pro Thr Ala Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro

140 145 150

Gln Ala Val Val Asp Met Val Ala Gly Ala His Trp Gly Val Leu

155 160 165

Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val

170 175 180

Leu Ile Val Met Leu Leu Phe Ala Gly Val Asp Gly

185 190

(2) INFORMATION FOR SEQ ID NO: 74:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : unknown 

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: T3 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74: 



Tyr Glu Val Arg Asn Val Ser Gly Val Tyr Tyr Val Thr Asn Asp

5 10 15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Thr Ala Asp Met Ile Met

20 25 30

His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Ser Asn Ser Ser

35 40 45

Arg Cys Trp Val Ala Leu Thr Pro Thr [...] Ala Ala Arg Asn Ala

50 55 60

Ser Val Pro Thr Lys Thr Ile Arg Arg His Val Asp Leu Leu Val

65 70 75

Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Cys

80 85 90

Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Phe Ser Pro Arg

95 100 105

Arg His Glu Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly

110 115 120

His Val Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp

125 130 135

Ser Pro Thr Thr Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro

140 145 150

Gln Ala Val Val Asp Met Val Ala Gly Ala His Trp Gly Val Leu

155 160 165

Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val

170 175 180

Leu Ile Val Leu Leu Leu Phe Ala Gly Val Asp Gly

185 190









(2) INFORMATION FOR SEQ ID NO: 75: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: T10

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75:

Tyr Glu Val Arg Asn Val Ser Gly Met Tyr His Val Thr Asn Asp

5 10 15

Cys Ser Asn Ser Ser Ile Val Phe Glu Ala Ala Asp Leu Ile Met

20 25 30

His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Gly Asn Ser Ser

35 40 45

Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Thr

50 55 60

Ser Val Pro Thr Thr Thr Ile Arg Arg His Val Asp Leu Leu Val

65 70 75

Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Cys

80 85 90

Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Phe Ser Pro Arg

95 100 105

Arg His Glu Thr Leu Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly

110 115 120

His Leu Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp

125 130 135

Ser Pro Thr Thr Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro

140 145 150

Gln Ala Val Met Asp Met Val Thr Gly Ala His Trp Gly Val Leu

155 160 165

Ala Gly Leu Ala Tyr Tyr Ser Met Ala Gly Asn Trp Ala Lys Val

170 175 180

Leu Ile Val Met Leu Leu Phe Ala Gly Val Asp Gly

185 190









(2) INFORMATION FOR SEQ ID NO: 76:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: US6

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76:

Tyr Glu Val Arg Asn Val Ser Gly Met Tyr His Val Thr Asn Asp

5 10 15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp Met Ile Met

20 25 30

His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Asn Asn Ser Ser

35 40 45

Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Ala

50 55 60

Ser Val Pro Thr Thr Thr Ile Arg Arg His Val Asp Leu Leu Val

65 70 75
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Glv Ala 


A1 a 

ill Ct 


± I1X 


irXie 


v*ys 


ser 








AO 

O v 






Glv Ser 


V CL J_ 




iiCU 


lie 


CJ ^"N w» 

ber 














Gin His 


Glu 




V CLJL 




^\ M% 

ASp 








1 1 n 




His Val 


Ser 


Gly 


His 


Arg 


Met 








125 






Ser Pro 


Thr 


Ala 


Ala 


Leu 


Val 








140 






Gin Ala 


Val 


Met 


Asp 


Met 


Val 








155 






Ala Gly 


Leu 


Ala 


Tyr 


Tyr 


Ser 








170 






Leu lie 


Val 


Leu 


Leu 


Leu 


Phe 



10 185 



■ 99 












Ala 


Met 


Tyr 


Val 


Gly 


Asp Leu Cys 






85 






90 


Gin 


Leu 


Phe 


Thr 


Phe 


Ser Pro Arg 


• 




100 






105 


Cys 


Asn 


Cys 


Ser 


lie 


Tyr Pro Gly 






115 






120 


Ala 


Trp 


Asp 


Met 


Met 


Met Asn Trp 












135 


Val 


Ser 


Gin 


Leu 


Leu 


Arg lie Pro 






145 






150 


Ala 


Gly 


Ala 


His 


Trp 


Gly Val Leu 






160 






165 


Met 


Val 


Gly 


Asn 


Trp 


Ala Lys Val 






175 






180 


Ala 


Gly 


Val 


Asp Gly 





190 



(2) INFORMATION FOR SEQ ID NO: 77: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: T2 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77:

Ala Gln Val Arg Asn Thr Ser Arg Gly Tyr Met Val Thr Asn Asp

5 10 15

Cys Ser Asn Glu Ser Ile Thr Trp Gln Leu Gln Ala Ala Val Leu

20 25 30

His Val Pro Gly Cys Ile Pro Cys Glu Arg Leu Gly Asn Thr Ser

35 40 45

Arg Cys Trp Ile Pro Val Thr Pro Asn Val Ala Val Arg Gln Pro

50 55 60

Gly Ala Leu Thr Gln Gly Leu Arg Thr His Ile Asp Met Val Val

65 70 75

Met Ser Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys

80 85 90

Gly Gly Val Met Leu Ala Ala Gln Met Phe Ile Val Ser Pro Arg

95 100 105

Arg His Trp Phe Val Gln Glu Cys Asn Cys Ser Ile Tyr Pro Gly

110 115 120

Thr Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp

125 130 135

Ser Pro Thr Ala Thr Met Ile Leu Ala Tyr Ala Met Arg Val Pro

140 145 150

Glu Val Ile Ile Asp Ile Ile Gly Gly Ala His Trp Gly Val Met

155 160 165

Phe Gly Leu Ala Tyr Phe Ser Met Gln Gly Ala Trp Ala Lys Val

170 175 180

Ile Val Ile Leu Leu Leu Ala Ala Gly Val Asp Ala

185 190



(2) INFORMATION FOR SEQ ID NO: 78: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: T4

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78:

Ala Gln Val Lys Asn Thr Thr Asn Ser Tyr Met Val Thr Asn Asp

5 10 15

Cys Ser Asn Asp Ser Ile Thr Trp Gln Leu Gln Ala Ala Val Leu

20 25 30

His Val Pro Gly Cys Val Pro Cys Glu Lys Thr Gly Asn Thr Ser

35 40 45

Arg Cys Trp Ile Pro Val Ser Pro Asn Val Ala Val Arg Gln Pro

50 55 60

Gly Ala Leu Thr Gln Gly Leu Arg Thr His Ile Asp Met Val Val

65 70 75

Met Ser Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys

80 85 90

Gly Gly Val Met Leu Ala Ala Gln Met Phe Ile Val Ser Pro Gln

95 100 105

His His Trp Phe Val Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly

110 115 120

Thr Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp

125 130 135

Ser Pro Thr Ala Thr Met Ile Leu Ala Tyr Ala Met Arg Val Pro

140 145 150

Glu Val Ile Leu Asp Ile Val Ser Gly Ala His Trp Gly Val Met

155 160 165

Phe Gly Leu Ala Tyr Phe Ser Met Gln Gly Ala Trp Ala Lys Val

170 175 180

Val Val Ile Leu Leu Leu Ala Ala Gly Val Asp Ala

185 190



(2) INFORMATION FOR SEQ ID NO: 79: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: T9 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79: 

[...] Thr Ser Tyr Met Val Thr Asn Asp

10 15

[...] Trp Gln Leu Gln Ala Ala Val Leu

25 30

[...] Cys Glu Arg Val Gly Asn Ala Ser

40 45

[...] Pro Asn Val Ala Val Gln Arg Pro

55 60

[...] Arg Thr His Ile Asp Met Val Val

70 75

[...] Ala Leu Tyr Val Gly Asp Leu Cys

85 90

[...] Gln Met Phe Ile Ile Ser Pro Gln

100 105

[...] Cys Asn Cys Ser Ile Tyr Pro Gly

115 120

[...] Ala Trp Asp Met Met Met Asn Trp

130 135

[...] Leu Ala Tyr Ala Met Arg Val Pro

145 150

[...] Ser Gly Ala His Trp Gly Val Met

160 165

[...] Met Gln Gly Ala Trp Ala Lys Val

175 180

[...] Ala Gly Val Asp Ala

190

(2) INFORMATION FOR SEQ ID NO: 80: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: US10

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80:

Val Gln Val Lys Asn Thr Ser Thr Ser Tyr Met Val Thr Asn Asp

5 10 15

Cys Ser Asn Asp Ser Ile Thr Trp Gln Leu Glu Ala Ala Val Leu

20 25 30



20 



Aid UJ. LL 


Val 


jjys 


Asn 

c 


xnr 


Ser 


\*jb OCX 


Asn 




D 


lie 


Thr 








on 






file Val 


f ro 


Gly 


Cys 


val 


Pro 








JO 






Arg \jys 


Trp 


lie 


Pro 


Val 


Ser 








50 






Gly Ala 


Leu 


Thr 


Gin 


Gly 


Leu 








65 






Met Ser 


Ala 


Thr 


Leu 


Cys 


Ser 








80 




Gly Gly 


Val 


Met 


Leu 


Ala 


Ala 








95 






His His 


Trp 


Phe 


Val 


Gin 


Glu 








110 






Thr lie 


Thr 


Gly 


His 


Arg 


Met 








125 






Ser Pro 


Thr 


Thr 


Thr 


Met 


lie 








140 






Glu Val 


lie 


lie 


Asp 


He 


lie 








155 






Phe Gly 


Leu 


Ala 


Tyr 


Phe 


Ser 








170 






Val Val 


lie 


Leu 


Leu 


Leu 


Thr 








185 
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10 



15 



25 



30 



35 



His Val 


Pro Gly Cys Val 


Pro 


Cys 


Glu 




Val G3 v A cm Thr" Cor* 






35 








40 




Arg Cys 


Trp 


He Pro Val 


Ser 


Pro 


noli 


Val 


nld VcLX VjXII ai^ irro 


Glv Ala 




50 










fiO 


Leu 


Thr Gin Gly Leu Arg 


£11.1. 


H J. 5 


iic Asp roec va± vax 






65 








70 
/ \j 


/ D 


Met Ser 


Ala 


Thr Leu Cys 


Ser 


Ala 


lieu 


ryr 


vaj. uiy Asp Fne uys 






80 








O J 


y u 


Gly Gly 


Met 


Met Leu Ala 


Ala 


Gin 


Mat* 




vai oer irro Arg 


His His 




95 








ion 

■L \J \J 




Ser 


Phe Val Gin 


Glu 


Cys 


noU 




oer xjLe ±yr jt*ro tjj.y 


Thr lie 




110 








lie 


120 


Thr Gly His Arg Met Ala 


* 1 'yr\ 

irp 


IV 0VN 

ASp 


wee wee Met Asn Trp 






125 








130 


135 


Ser Pro 


Thr 


Ala Thr Leu 


lie 


Leu 


Ala 


Tyr 


Val Met Arg Val Pro 


Glu Val 




140 








145 


150 


He 


lie Asp He 


He 


Ser 


Gly 


Ala 


His Trp Gly Val Leu 


Phe Gly 




155 








160 


165 


Leu 


Ala Tyr Phe 


Ser 


Met 


Gin 


Gly 


Ala Trp Ala Lys Val 






170 








175 


180 


Val Val 


lie 


Leu Leu Leu 


Ala 


Ala 


Gly 


Val 


Asp Ala 






185 






190 



(2) INFORMATION FOR SEQ ID NO: 81: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: DK8

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81:

Val Glu Val Arg Asn Ile Ser Ser Ser Tyr Tyr Ala Thr Asn Asp

5 10 15

Cys Ser Asn Asn Ser Ile Thr Trp Gln Leu Thr Asp Ala Val Leu

20 25 30

His Leu Pro Gly Cys Val Pro Cys Glu Asn Asp Asn Gly Thr Leu

35 40 45

Arg Cys Trp Ile Gln Val Thr Pro Asn Val Ala Val Lys His Arg

50 55 60

Gly Ala Leu Thr His Asn Leu Arg Thr His Val Asp Val Ile Val

65 70 75

Met Ala Ala Thr Val Cys Ser Ala Leu Tyr Val Gly Asp Val Cys

80 85 90

Gly Ala Val Met Ile Val Ser Gln Ala Leu Ile Ile Ser Pro Glu

95 100 105

Arg His Asn Phe Thr Gln Glu Cys Asn Cys Ser Ile Tyr Gln Gly

110 115 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Leu Asn Trp

125 130 135

Ser Pro Thr Leu Thr Met Ile Leu Ala Tyr Ala Ala Arg Val Pro

140 145 150

Glu Leu Ala Leu Gln Val Val Phe Gly Gly His Trp Gly Val Val

155 160 165

Phe Gly Leu Ala Tyr Phe Ser Met Gln Gly Ala Trp Ala Lys Val

170 175 180

Ile Ala Ile Leu Leu Leu Val Ala Gly Val Asp Ala

185 190



(2) INFORMATION FOR SEQ ID NO: 82: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: DK11



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82:


Val 


Glu 


Val 


Arg 


Asn 


Thr 


Ser 


Ser 


Ser 


Tyr 


Tyr 


Cys 








5 










10 


Ser 


Asn 


Asn 


Ser 


He 


Thr 


Trp 


Gin 


Leu 


Thr 


His 








20 








25 




Leu 


Pro 


Gly 


Cys 


Val 


Pro 


Cys 


Glu 


Asn 


Asp 


His 








35 










40 


Cys 


Trp 


He 


Gin 


Val 


Thr 


Pro 


Asn 


Val 


Ala 










50 










55 




Gly Ala 


Leu 


Thr 


His 


Asn 


Leu 


Arg 


Ala 


His 


He 










65 








70 




Met 


Ala 


Ala 


Thr 


Val 


Cys 


Ser 


Ala 


Leu 


Tyr 


Val 










80 










85 




Gly Ala 


Val 


Met 


lie 


Val 


Ser 


Gin 


Ala 


Phe 


lie 


His 








95 










100 




His 


His 


Phe 


Thr 


Gin 


Glu 


Cys 


Asn 


Cys 


Ser 


His 








110 










115 




lie 


Thr Gly 


His 


Arg 


Met 


Ala 


Trp 


Asp 


Met 










125 








130 




Ser 


Pro 


Thr 


Leu 


Thr 


Met 


He 


Leu 


Ala 


Tyr 


Ala 










140 










145 




Glu 


Leu 


Val 


Leu 


Glu 


Val 


Val 


Phe 


Gly 


Gly 


His 










155 










160 




Phe 


Gly 


Leu 


Ala 


Tyr 


Phe 


Ser 


Met 


Gin 


Gly 


Ala 










170 










175 




He 


Ala 


He 


Leu 


Leu 


Leu 


Val 


Ala 


Gly 


Val 


Asp 



15 

Leu 
30 
Leu 

45 
Arg 
60 
Val 
75 
Cys 
90 
Glu 
105 
Gly 
120 
Trp 
135 
Pro 
150 
Val 
165 
Val 
180 

Asp Ala 

185 190

(2) INFORMATION FOR SEQ ID NO: 83:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SW3

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83:



Val 


Glu 


Val 


Arg 






^5 

ber 


ber 


Ser Tyr Tyr Ala Thr 


Asn Asp 


Cys 








5 








10 




15 


Ser 


Asn 


Ser 


fa ex 


lie 


Thr 


Trp 


Gin Leu Thr Asn Ala 


Val 


Leu 


His 








20 








25 




30 


Leu 


Pro Gly 


Cys 


Val 


Pro 


Cys 


Glu Asn Asp Asn Gly 


Thr 


Leu 


His 








35 








40 




45 


Cys 


Trp 


He 


Gin 
50 


Val 


Thr 


Pro 


Asn Val Ala Val Lys 
55 


His 


Arg 
60 


Gly Ala 


Leu 


Thr 


His 


Asn 


Leu 


Arg 


Ala His Val Asp Met 


He 


Val 










65 








70 




75 


Met 


Ala 


Ala 


Thr 


Val 
80 


Cys 


Ser 


Ala 


Leu Tyr Val Gly Asp 
85 


Met 


Cys 
90 


Gly Ala 


Val 


Met 


He 


Val 


Ser 


Gin 


Ala Phe He He Ser 


Pro 


Glu 




His 






95 








100 




105 


Arg 


Asn 


Phe 


Thr 


Gin 


Glu 


Cys 


Asn Cys Ser He Tyr 


Gin Gly 


Arg 


lie 






110 








115 




120 


Thr Gly 


His 


Arg 


Met 


Ala 


Trp Asp Met Met Leu 


Asn Trp 


Ser 








125 








130 




135 


Pro 


Thr 


Leu 


Thr 


Met 


He 


Leu 


Ala Tyr Ala Ala Arg 


Val 


Pro 


Glu 








140 








145 




150 


Leu 


Val 


Leu 


Glu 


Val 


Val 


Phe 


Gly Gly His Trp Gly 


Val 


Val 


Phe 








155 








160 




165 


Gly 


Leu 


Ala 


Tyr 


Phe 


Ser 


Met 


Gin Gly Ala Trp Ala 


Lys 


val 


He 








170 








175 


180 


Ala 


lie 


Leu 


Leu 
185 


Leu 


Val 


Ala 


Gly Val Asp Ala 

190 







(2) INFORMATION FOR SEQ ID NO: 84:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: T8

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84:

Val Glu Val Arg Asn Thr Ser Phe Ser Tyr Tyr Ala Thr Asn Asp

5 10 15

Cys Ser Asn Asn Ser Ile Thr Trp Gln Leu Thr Asn Ala Val Leu

20 25 30

His Leu Pro Gly Cys Val Pro Cys Glu Asn Asp Asn Gly Thr Leu

35 40 45

Arg Cys Trp Ile Gln Val Thr Pro Asn Val Ala Val Lys His Arg

50 55 60

Gly Ala Leu Thr His Asn Leu Arg Thr His Val Asp Val Ile Val

65 70 75

Met Ala Ala Thr Val Cys Ser Ala Leu Tyr Val Gly Asp Val Cys

80 85 90

Gly Ala Val Met Ile Ala Ser Gln Ala Phe Ile Ile Ser Pro Glu

95 100 105

Arg His Asn Phe Thr Gln Glu Cys Asn Cys Ser Ile Tyr Gln Gly

110 115 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Leu Asn Trp

125 130 135

Ser Pro Thr Leu Thr Met Ile Leu Ala Tyr Ala Ala Arg Val Pro

140 145 150

Glu Leu Val Leu Glu Val Val Phe Gly Gly His Trp Gly Val Val

155 160 165

Phe Gly Leu Ala Tyr Phe Ser Met Gln Gly Ala Trp Ala Lys Val

170 175 180

Ile Ala Ile Leu Leu Leu Val Ala Gly Val Asp Ala

185 190

(2) INFORMATION FOR SEQ ID NO: 85:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: S83

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85:

Val Glu Val Lys Asp Thr Gly Asp Ser Tyr Met Pro Thr Asn Asp

5 10 15

Cys Ser Asn Ser Ser Ile Val Trp Gln Leu Glu Gly Ala Val Leu

20 25 30

His Thr Pro Gly Cys Val Pro Cys Glu Arg Thr Ala Asn Val Ser

35 40 45

Arg Cys Trp Val Pro Val Ala Pro Asn Leu Ala Ile Ser Gln Pro

50 55 60

Gly Ala Leu Thr Lys Gly Leu Arg Ala His Ile Asp Ile Ile Val

65 70 75

Met Ser Ala Thr Val Cys Ser Ala Leu Tyr Val Gly Asp Val Cys

80 85 90

Gly Ala Leu Met Leu Ala Ala Gln Val Val Val Val Ser Pro Gln

95 100 105

His His Thr Phe Val Gln Glu Cys Asn Cys Ser Ile Tyr Pro Gly

110 115 120

Arg Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp

125 130 135

Ser Pro Thr Thr Thr Met Leu Leu Ala Tyr Leu Val Arg Ile Pro

140 145 150

Glu Val Ile Leu Asp Ile Val Thr Gly Gly His Trp Gly Val Met

155 160 165

Phe Gly Leu Ala Tyr Phe Ser Met Gln Gly Ser Trp Ala Lys Val

170 175 180

Ile Val Ile Leu Leu Leu Thr Ala Gly Val Glu Ala

185 190


(2) INFORMATION FOR SEQ ID NO: 86:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: DK12

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86:

Leu Glu Trp Arg Asn Val Ser Gly Leu Tyr Val Leu Thr Asn Asp

5 10 15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Asp Asp Val Ile Leu

20 25 30

His Thr Pro Gly Cys Val Pro Cys Val Gln Asp Gly Asn Thr Ser

35 40 45

Thr Cys Trp Thr Ser Val Thr Pro Thr Val Ala Val Arg Tyr Val

50 55 60

Gly Ala Thr Thr Ala Ser Ile Arg Ser His Val Asp Leu Leu Val

65 70 75

Gly Ala Ala Thr Met Cys Ser Ala Leu Tyr Val Gly Asp Val Cys

80 85 90

Gly Ala Val Phe Leu Val Gly Gln Ala Phe Thr Phe Arg Pro Arg

95 100 105

Arg His Gln Thr Val Gln Thr Cys Asn Cys Ser Leu Tyr Pro Gly

110 115 120

His Leu Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp

125 130 135

Ser Pro Ala Val Gly Met Val Val Ala His Val Leu Arg Leu Pro

140 145 150

Gln Thr Leu Phe Asp Ile Ile Ala Gly Ala His Trp Gly Ile Met

155 160 165

Ala Gly Leu Ala Tyr Tyr Ser Met Gln Gly Asn Trp Ala Lys Val

170 175 180

Ala Ile Ile Met Val Met Phe Ser Gly Val Asp Ala

185 190

(2) INFORMATION FOR SEQ ID NO: 87: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : unknown 

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: HK10 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87:


Leu Glu 


Trp 


Arg Asn 
5 


Val 


Ser 


Gly 


Cys Pro 


Asn 


Ser Ser 


He 


Val 


Tyr 






20 






His Thr 


Pro 


Gly Cys 


Val 


Pro 


Cys 






35 






Thr Cys 


Trp 


Thr Ser 


Val 


Thr 


Pro 






50 








Gly Ala 


Thr 


Thr Ala 


Ser 


He 


Arg 






65 






Gly Ala 


Ala 


Thr Met 


Cys 


Ser 


Ala 






80 






Gly Ala 


Val 


Phe Leu 


Val 


Gly 


Gin 






95 






Arg His 


Gin 


Thr Val 


Gin 


Thr 


Cys 






110 






His Leu 


Ser 


Gly His 


Arg 


Met 


Ala 






125 






Ser Pro 


Ala 


Val Gly 


Met 


Val 


Val 






140 








Gin Thr 


Leu 


Phe Asp 


He 


He 


Ala 






155 








Ala Gly 


Leu 


Ala Tyr 


Tyr 


Ser 


Met 






170 








Ala He 


lie 


Met Val 


Met 


Phe 


Ser 



10 15 

31u Ala Asp Asp Val He Leu 

25 30 
fal Gin Asp Gly Asn Thr Ser 

40 45 
Thr Val Ala Val Arg Tyr Val 

55 60 

5er His Val Asp Leu Leu Val 

70 75 

jeu Tyr Val Gly Asp Met Cys 

85 90 

Ha Phe Thr Phe Arg Pro Arg 

100 105 

isn Cys Ser Leu Tyr Pro Gly 

115 120 

'rp Asp Met Met Met Asn Trp 

25 _ _ . I 30 135 

ila His Val Leu Arg Leu Pro 
145 150 
\ly Ala His Trp Gly He Leu 
160 165 
fin Gly Asn Trp Ala Lys Val 
175 180 
My Val Asp Ala 

30 185 190 

(2) INFORMATION FOR SEQ ID NO: 88: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: S2

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88:

Leu Glu Trp Arg Asn Thr Ser Gly Leu Tyr Val Leu Thr Asn Asp

5 10 15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Asp Asp Val Ile Leu

20 25 30

His Thr Pro Gly Cys Val Pro Cys Val Gln Asp Gly Asn Thr Ser

35 40 45

Thr Cys Trp Thr Pro Val Thr Pro Thr Val Ala Val Arg Tyr Val

50 55 60

Gly Ala Thr Thr Ala Ser Ile Arg Ser His Val Asp Leu Leu Val

65 70 75

Gly Ala Ala Thr Met Cys Ser Ala Leu Tyr Val Gly Asp Met Cys

80 85 90

Gly Ala Val Phe Leu Val Gly Gln Ala Phe Thr Phe Arg Pro Arg

95 100 105

Arg His Gln Thr Val Gln Thr Cys Asn Cys Ser Leu Tyr Pro Gly

110 115 120

His Leu Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp

125 130 135

Ser Pro Ala Val Gly Met Val Val Ala His Val Leu Arg Leu Pro

140 145 150

Gln Thr Val Phe Asp Ile Ile Ala Gly Ala His Trp Gly Ile Leu

155 160 165

Ala Gly Leu Ala Tyr Tyr Ser Met Gln Gly Asn Trp Ala Lys Val

170 175 180

Ala Ile Ile Met Val Met Phe Ser Gly Val Asp Ala

185 190



(2) INFORMATION FOR SEQ ID NO: 89:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: S52

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89:

Leu Glu Trp Arg Asn Thr Ser Gly Leu Tyr Val Leu Thr Asn Asp

5 10 15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Asp Asp Val Ile Leu

20 25 30

His Thr Pro Gly Cys Val Pro Cys Val Gln Asp Gly Asn Thr Ser

35 40 45

Met Cys Trp Thr Pro Val Thr Pro Thr Val Ala Val Arg Tyr Val

50 55 60

Gly Ala Thr Thr Ala Ser Ile Arg Ser His Val Asp Leu Leu Val

65 70 75

Gly Ala Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Met Cys

80 85 90

Gly Ala Val Phe Leu Val Gly Gln Ala Phe Thr Phe Arg Pro Arg

95 100 105

Arg His Gln Thr Val Gln Thr Cys Asn Cys Ser Leu Tyr Pro Gly

110 115 120

His Val Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp

125 130 135

Ser Pro Ala Val Gly Met Val Val Ala His Ile Leu Arg Leu Pro

140 145 150

Gln Thr Leu Phe Asp Ile Leu Ala Gly Ala His Trp Gly Ile Leu

155 160 165

Ala Gly Leu Ala Tyr Tyr Ser Met Gln Gly Asn Trp Ala Lys Val

170 175 180

Ala Ile Val Met Ile Met Phe Ser Gly Val Asp Ala

185 190

(2) INFORMATION FOR SEQ ID NO: 90:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: S54

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90:

Leu Glu Trp Arg Asn Thr Ser Gly Leu Tyr Ile Leu Thr Asn Asp

5 10 15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Asp Asp Val Ile Leu

20 25 30

His Thr Pro Gly Cys Val Pro Cys Val Gln Asp Gly Asn Thr Ser

35 40 45

Thr Cys Trp Thr Pro Val Thr Pro Thr Val Ala Val Arg Tyr Val

50 55 60

Gly Ala Thr Thr Ala Ser Ile Arg Ser His Val Asp Leu Leu Val

65 70 75

Gly Ala Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Met Cys

80 85 90

Gly Ala Val Phe Leu Val Gly Gln Ala Phe Thr Phe Arg Pro Arg

95 100 105

Arg His Gln Thr Val Gln Thr Cys Asn Cys Ser Leu Tyr Pro Gly

110 115 120

His Leu Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp

125 130 135

Ser Pro Ala Val Gly Met Val Val Ala His Ile Leu Arg Leu Pro

140 145 150

Gln Thr Leu Phe Asp Ile Leu Ala Gly Ala His Trp Gly Ile Leu

155 160 165

Ala Gly Leu Ala Tyr Tyr Ser Met Gln Gly Asn Trp Ala Lys Val

170 175 180

Ala Ile Ile Met Ile Met Phe Ser Gly Val Asp Ala

185 190









(2) INFORMATION FOR SEQ ID NO: 91:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: Z4

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91:

Glu His Tyr Arg Asn Ala Ser Gly Ile Tyr His Ile Thr Asn Asp

5 10 15

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp His His Ile Leu

20 25 30

His Leu Pro Gly Cys Val Pro Cys Val Met Thr Gly Asn Thr Ser

35 40 45

Arg Cys Trp Thr Pro Val Thr Pro Thr Val Ala Val Ala His Pro

50 55 60

Gly Ala Pro Leu Glu Ser Phe Arg Arg His Val Asp Leu Met Val

65 70 75

Gly Ala Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys

80 85 90

Gly Gly Ala Phe Leu Met Gly Gln Met Ile Thr Phe Arg Pro Arg

95 100 105

Arg His Trp Thr Thr Gln Glu Cys Asn Cys Ser Ile Tyr Thr Gly

110 115 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp

125 130 135

Ser Pro Thr Thr Thr Leu Leu Leu Ala Gln Ile Met Arg Val Pro

140 145 150

Thr Ala Phe Leu Asp Met Val Ala Gly Gly His Trp Gly Val Leu

155 160 165

Ala Gly Leu Ala Tyr Phe Ser Met Gln Gly Asn Trp Ala Lys Val

170 175 180

Val Leu Val Leu Phe Leu Phe Ala Gly Val Asp Ala

185 190







(2) INFORMATION FOR SEQ ID NO: 92:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: Z1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92:

Val His Tyr Arg Asn Ala Ser Gly Val Tyr His Val Thr Asn Asp

5 10 15

Cys Pro Asn Thr Ser Ile Val Tyr Glu Thr Glu His His Ile Met

20 25 30

His Leu Pro Gly Cys Val Pro Cys Val Arg Thr Glu Asn Thr Ser

35 40 45

Arg Cys Trp Val Pro Leu Thr Pro Thr Val Ala Ala Pro Tyr Pro

50 55 60

Asn Ala Pro Leu Glu Ser Met Arg Arg His Val Asp Leu Met Val

65 70 75

Gly Ala Ala Thr Met Cys Ser Ala Phe Tyr Ile Gly Asp Leu Cys

80 85 90

Gly Gly Val Phe Leu Val Gly Gln Leu Phe Asp Phe Arg Pro Arg

95 100 105

Arg His Trp Thr Thr Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly

110 115 120

His Val Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp

125 130 135

Ser Pro Thr Ser Ala Leu Ile Met Ala Gln Ile Leu Arg Ile Pro

140 145 150

Ser Ile Leu Gly Asp Leu Leu Thr Gly Gly His Trp Gly Val Leu

155 160 165

Ala Gly Leu Ala Phe Phe Ser Met Gln Ser Asn Trp Ala Lys Val

170 175 180

Ile Leu Val Leu Phe Leu Phe Ala Gly Val Glu Gly

185 190
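As a hedged illustration only, the sketch below shows one way a listing such as SEQ ID NO: 92 could be checked against its declared LENGTH: it converts standard three-letter residue codes to one-letter codes and skips the position markers. The function name `to_one_letter` and the sample variable `first_row` are assumptions made for this example and are not part of the listing.

```python
# Standard three-letter to one-letter codes for the 20 common amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(text):
    """Convert whitespace-separated three-letter residues to one-letter codes.

    Purely numeric tokens (position markers such as 5, 10, 15) are skipped.
    Any other unrecognized token raises ValueError, so a damaged residue
    name is reported instead of being silently dropped.
    """
    residues = []
    for token in text.split():
        if token.isdigit():
            continue
        if token not in THREE_TO_ONE:
            raise ValueError(f"unrecognized residue token: {token!r}")
        residues.append(THREE_TO_ONE[token])
    return "".join(residues)


# First row of SEQ ID NO: 92 together with its position markers.
first_row = "Val His Tyr Arg Asn Ala Ser Gly Val Tyr His Val Thr Asn Asp\n5 10 15"
print(to_one_letter(first_row))  # VHYRNASGVYHVTND
```

Applied to all thirteen rows of an entry, the length of the returned string can be compared with the declared 192 amino acids.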






(2) INFORMATION FOR SEQ ID NO: 93: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: Z6

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93:



Val 


Asn 


iyr 




A CJ 
AO II 


Ala 


G av> 


Vjriy 


val 


Tyr 

10 


TT 4 <-» 

HIS 


val Tnr Asn Asp 

15 


Cys 


Pro 


AD II 


OCX 


G a v 


lie 


val 


Tyr 


Glu 


Ala 


Glu 


His Gin He Leu 


His 






















30 


Leu 


JriO 




Cys 
35 


Leu 


Pro 


Cys 


Val 


Arg 
40 


Val 


Gly Asn Gin Ser 

45 


Arg 


Cys 


Trp 


vai 


Ala 
50 


Leu 


Thr 


Pro 


Thr 


Val 

55 


Ala 


Val Ser Tyr lie 

60 


Gly Ala 


Fro 


Leu 


Asp 


Ser 


Leu 


Arg 


Arg 


His 


Val 


Asp Leu Met Val 










65 










70 




75 


Gly Ala 


Ala 


Thr 


Val 


Cys 


Ser 


Ala 


Leu 


Tyr 


Val 


Gly Asp Leu Cys 










80 










85 




90 


Gly Gly 


Axa 


Pne 


Leu 


Val 


Gly 


Gin 


Met 


Phe 


Ser 


Phe Gin Pro Arg 




His 






95 










100 




105 


Arg 


Trp 


Thr 


Thr 


Gin 


Asp 


Cys 


Asn 


Cys 


Ser 


He Tyr Ala Gly 


His 








110 










115 




120 


lie 


Thr 


Gly 


His 
125 


Arg 


Met 


Ala 


Trp 


Asp 
130 


Met 


Met Met Asn Trp 

135 


Ser 


Pro 


Thr 


Thr 


Thr 
140 


Leu 


Leu 


Leu 


Ala 


Gin 
145 


Val 


Met Arg He Pro 

150 


Ser 


Thr 


Leu 


Val 


Asp 
155 


Leu 


Leu 


Ala 


Gly 


Gly 
160 


His 


Trp Gly Val Leu 

165 


Val 


Gly 


Leu 


Ala 


Tyr 


Phe 


Ser 


Met 


Gin 


Ala 


Asn Trp Ala Lys Val 










170 










175 




180 


He 


Leu 


Val 


Leu 


Phe 


Leu 


Phe 


Ala 


Gly 


Val 


Asp Ala 










185 








190 







(2) INFORMATION FOR SEQ ID NO: 94:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: Z7

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94:

Val Asn Tyr His Asn Ala Ser Gly Val Tyr His Ile Thr Asn Asp

5 10 15

Cys Pro Asn Ser Ser Ile Met Tyr Glu Ala Glu His His Ile Leu

20 25 30

His Leu Pro Gly Cys Val Pro Cys Val Arg Glu Gly Asn Gln Ser

35 40 45

Arg Cys Trp Val Ala Leu Thr Pro Thr Val Ala Ala Pro Tyr Ile

50 55 60

Gly Ala Pro Leu Glu Ser Ile Arg Arg His Val Asp Leu Met Val

65 70 75

Gly Ala Ala Thr Val Cys Ser Ala Leu Tyr Ile Gly Asp Leu Cys

80 85 90

Gly Gly Val Phe Leu Val Gly Gln Met Phe Ser Phe Gln Pro Arg

95 100 105

Arg His Trp Thr Thr Gln Asp Cys Asn Cys Ser Ile Tyr Ala Gly

110 115 120

His Val Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp

125 130 135

Ser Pro Thr Thr Thr Leu Val Leu Ala Gln Val Met Arg Ile Pro

140 145 150

Ser Thr Leu Val Asp Leu Leu Thr Gly Gly His Trp Gly Ile Leu

155 160 165

Ile Gly Val Ala Tyr Phe Cys Met Gln Ala Asn Trp Ala Lys Val

170 175 180

Ile Leu Val Leu Phe Leu Tyr Ala Gly Val Asp Ala

185 190

(2) INFORMATION FOR SEQ ID NO: 95:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: DK13

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95:

Tyr Asn Tyr Arg Asn Ser Ser Gly Val Tyr His Val Thr Asn Asp

5 10 15

Cys Pro Asn Ser Ser Ile Val Tyr Glu Thr Asp Tyr His Ile Leu

20 25 30

His Leu Pro Gly Cys Val Pro Cys Val Arg Glu Gly Asn Lys Ser

35 40 45

Thr Cys Trp Val Ser Leu Thr Pro Thr Val Ala Ala Gln His Leu

50 55 60

Asn Ala Pro Leu Glu Ser Leu Arg Arg His Val Asp Leu Met Val

65 70 75

Gly Gly Ala Thr Leu Cys Ser Ala Leu Tyr Ile Gly Asp Val Cys

80 85 90

Gly Gly Val Phe Leu Val Gly Gln Leu Phe Thr Phe Gln Pro Arg

95 100 105

Arg His Trp Thr Thr Gln Asp Cys Asn Cys Ser Ile Tyr Thr Gly

110 115 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp

125 130 135

Ser Pro Thr Ala Thr Leu Val Leu Ala Gln Leu Met Arg Ile Pro

140 145 150

Gly Ala Met Val Asp Leu Leu Ala Gly Gly His Trp Gly Ile Leu

155 160 165

Val Gly Ile Ala Tyr Phe Ser Met Gln Ala Asn Trp Ala Lys Val

170 175 180

Ile Leu Val Leu Phe Leu Phe Ala Gly Val Asp Ala

185 190

(2) INFORMATION FOR SEQ ID NO: 96:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SA1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96:

Val Pro Tyr Arg Asn Ala Ser Gly Val Tyr His Val Thr Asn Asp

5 10 15

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp Ser Leu Ile Leu

20 25 30

His Ala Pro Gly Cys Val Pro Cys Val Arg Gln Asp Asn Val Ser

35 40 45

Arg Cys Trp Val Gln Ile Thr Pro Thr Leu Ser Ala Pro Thr Phe

50 55 60

Gly Ala Val Thr Ala Pro Leu Arg Arg Ala Val Asp Tyr Leu Ala

65 70 75

Gly Gly Ala Ala Leu Cys Ser Ala Leu Tyr Val Gly Asp Ala Cys

80 85 90

Gly Ala Val Phe Leu Val Gly Gln Met Phe Thr Tyr Arg Pro Arg

95 100 105

Gln His Thr Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Ser Gly

110 115 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp

125 130 135

Ser Pro Thr Thr Ala Leu Leu Met Ala Gln Met Leu Arg Ile Pro

140 145 150

Gln Val Val Ile Asp Ile Ile Ala Gly Gly His Trp Gly Val Leu

155 160 165

Phe Ala Ala Ala Tyr Phe Ala Ser Ala Ala Asn Trp Ala Lys Val

170 175 180

Val Leu Val Leu Phe Leu Phe Ala Gly Val Asp Gly

185 190



(2) INFORMATION FOR SEQ ID NO: 97:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SA4

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97:

Val Pro Tyr Arg Asn Ala Ser Gly Val Tyr His Val Thr Asn Asp

5 10 15

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp Asn Leu Ile Leu

20 25 30

His Ala Pro Gly Cys Val Pro Cys Val Arg Gln Asp Asn Val Ser

35 40 45

Lys Cys Trp Val Gln Ile Thr Pro Thr Leu Ser Ala Pro Asn Leu

50 55 60

Gly Ala Val Thr Ala Pro Leu Arg Arg Ala Val Asp Tyr Leu Ala

65 70 75

Gly Gly Ala Ala Leu Cys Ser Ala Leu Tyr Val Gly Asp Ala Cys

80 85 90

Gly Ala Val Phe Leu Val Gly Gln Met Phe Thr Tyr Arg Pro Arg

95 100 105

Gln His Thr Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Ser Gly

110 115 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp

125 130 135

Ser Pro Thr Thr Ala Leu Leu Met Ala Gln Leu Leu Arg Ile Pro

140 145 150

Gln Val Val Ile Asp Ile Ile Ala Gly Gly His Trp Gly Val Leu

155 160 165

Phe Ala Ala Ala Tyr Phe Ala Ser Ala Ala Asn Trp Ala Lys Val

170 175 180

Ile Leu Val Leu Phe Leu Phe Ala Gly Val Asp Ala

185 190



(2) INFORMATION FOR SEQ ID NO: 98:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SA5

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98:

Val Pro Tyr Arg Asn Ala Ser Gly Val Tyr His Val Thr Asn Asp

5 10 15

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp Asn Leu Ile Leu

20 25 30

His Ala Pro Gly Cys Val Pro Cys Val Lys Glu Gly Asn Val Ser

35 40 45



WO 95/01442 



PCT/US94/07320 




Arg Cys Trp Val Gln Ile Thr Pro Thr Leu Ser Ala Pro Asn Leu

50 55 60

Gly Ala Val Thr Ala Pro Leu Arg Arg Val Val Asp Tyr Leu Ala

65 70 75

Gly Gly Ala Ala Leu Cys Ser Ala Leu Tyr Val Gly Asp Ala Cys

80 85 90

Gly Ala Val Phe Leu Val Gly Gln Met Phe Thr Tyr Arg Pro Arg

95 100 105

Gln His Thr Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Ser Gly

110 115 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp

125 130 135

Ser Pro Thr Thr Ala Leu Val Met Ala Gln Val Leu Arg Ile Pro

140 145 150

Gln Val Val Ile Asp Ile Ile Ala Gly Gly His Trp Gly Val Leu

155 160 165

Phe Ala Val Ala Tyr Phe Ala Ser Ala Ala Asn Trp Ala Lys Val

170 175 180

Val Leu Val Leu Phe Leu Phe Ala Gly Val Asp Gly

185 190

(2) INFORMATION FOR SEQ ID NO: 99:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SA6

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 99:

Val Pro Tyr Arg Asn Ala Ser Gly Val Tyr His Val Thr Asn Asp

5 10 15

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp Asp Leu Ile Leu

20 25 30

His Ala Pro Gly Cys Val Pro Cys Val Arg Lys Asp Asn Val Ser

35 40 45

Arg Cys Trp Val His Ile Thr Pro Thr Leu Ser Ala Pro Ser Leu

50 55 60

Gly Ala Val Thr Ala Pro Leu Arg Arg Ala Val Asp Tyr Leu Ala

65 70 75

Gly Gly Ala Ala Leu Cys Ser Ala Leu Tyr Val Gly Asp Val Cys

80 85 90

Gly Ala Leu Phe Leu Val Gly Gln Met Phe Thr Tyr Arg Pro Arg

95 100 105

Gln His Ala Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Ser Gly

110 115 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp

125 130 135

Ser Pro Ala Thr Ala Leu Val Met Ala Gln Met Leu Arg Ile Pro

140 145 150

Gln Val Val Ile Asp Ile Ile Ala Gly Gly His Trp Gly Val Leu

155 160 165

Phe Ala Ala Ala Tyr Phe Ala Ser Ala Ala Asn Trp Ala Lys Val

170 175 180

Val Leu Val Leu Phe Leu Phe Ala Gly Val Asp Ala

185 190



(2) INFORMATION FOR SEQ ID NO: 100: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SA7

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100:

Val Pro Tyr Arg Asn Ala Ser Gly Val Tyr His Val Thr Asn Asp

5 10 15

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp Asn Leu Ile Leu

20 25 30

His Ala Pro Gly Cys Val Pro Cys Val Arg Gln Asn Asn Val Ser

35 40 45

Arg Cys Trp Val Gln Ile Thr Pro Thr Leu Ser Ala Pro Asn Leu

50 55 60

Gly Ala Val Thr Ala Pro Leu Arg Arg Ala Val Asp Tyr Leu Ala

65 70 75

Gly Gly Ala Ala Leu Cys Ser Ala Leu Tyr Val Gly Asp Ala Cys

80 85 90

Gly Ala Val Phe Leu Val Gly Gln Met Phe Ser Tyr Arg Pro Arg

95 100 105

Gln His Thr Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Ser Gly

110 115 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp

125 130 135

Ser Pro Thr Thr Ala Leu Val Met Ala Gln Leu Leu Arg Ile Pro

140 145 150

Gln Val Val Ile Asp Ile Ile Ala Gly Gly His Trp Gly Val Leu

155 160 165

Phe Ala Ala Ala Tyr Phe Ala Ser Ala Ala Asn Trp Ala Lys Val

170 175 180

Val Leu Val Leu Phe Leu Phe Ala Gly Val Asp Ala

185 190







(2) INFORMATION FOR SEQ ID NO: 101:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SA13

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101:


Val Pro 


Tvr 


y 


JnOLL 


AXd 


G A T* 

06X7 




Val 


Cvs Pro 


A cm 


OCX. 


•j 

Cor 

OCX. 


Tl a 


vclj. 


ryr 










20 








His Ala 


Pro 


Gly 


Cys 


Val 


Pro 


Cys 


Val 








35 








Arg Cys 


Trp 


Val 


Gin 


lie 


Thr 


Pro 


Thr 








50 










Gly Ala 


Val 


Thr 


Ala 


Pro 


Leu 


Arg 


Arg 








65 






Gly Gly 


Ala 


Ala 


Leu 


Cys 


Ser 


Ala 


Leu 








80 








Gly Ala 


Val 


Phe 


Leu 


Val 


Gly 


Gin 


Met 








95 








Arg His 


Asn 


Val 


Val 


Gin 


Asp 


Cys 


Asn 








110 






His lie 


Thr 


Gly 


His 


Arg 


Met 


Ala 


Trp 








125 








Ser Pro 


Thr 


Thr 


Ala 


Leu 


Val 


Met 


Ala 








140 










Gin Val 


Val 


lie 


Asp 


He 


lie 


Ala 


Gly 








155 








Phe Ala 


Ala 


Ala 


Tyr 


Tyr 


Ala 


Ser 


Ala 








170 










Val Leu 


Val 


Leu 


Phe 


Leu 


Phe 


Ala 


Gly 



10 15 
Ala Asp Asp Leu He Leu 

25 30 
Arg Gin Gly Asn Val Ser 

40 45 
Leu Ser Ala Pro Ser Leu 

55 60 
Ala Val Asp Tyr Leu Ala 

70 75 
Tyr Val Gly Asp Ala Cys 

85 90 
Phe Thr Tyr Ser Pro Arg 
100 105 
Cys Ser He Tyr Ser Gly 
115 120 
Asp Met Met Met Asn Trp 
130 135 
Gin Leu Leu Arg He Pro 
145 150 
Ala His Trp Gly Val Leu 
160 165 
Ala Asn Trp Ala Lys Val 
175 180 



185 190 



(2) INFORMATION FOR SEQ ID NO: 102:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: HK2

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 102:



Leu 


mr 


Tyr Gin 


Asn 


Ser 


Ser Gin 


Leu 


Tyr His 


Leu 


Thr Asn Asp 


Cys 






i 








10 




15 


F3TO 


Asn ser 


Ser 


lie 


Val Leu 


Glu 


Ala Asp Ala 


Met He Leu 


His 














25 




30 


Leu 


Fro Varin 


Cys 


Leu 


Pro Cys 


Val 


Arg Val Asp 


Asp Arg Ser 


Thr 




irp His 


Jo 








40 




45 


uys 


Ala 


Val 


Thr Pro 


Thr 


Leu Ala 


lie 


Pro Asn Ala 




1X137 












55 




60 


Ser 


Fro Ala. 


Tnr 


Gin 


Phe Arg Arg 


His Val 


Asp 


Leu Leu Ala 


Gin 






65 








70 


75 


Ala 


Aia. vai 


val 


Cys 


Ser Ser 


Leu 


Tyr He 


Gin 


Asp Leu Cys 


V3J.I1 






80 








85 




90 


Ser 


Leu Pne 


Leu 


Ala 


Gin Gin 


Leu 


Phe Thr 


Phe 


Gin Pro Arg 


Arg 


HIS 




95 








100 




105 


^^^^ ■ ^^^^ft _ 

Trp Thr 


Val 


Gin Asp Cys Asn 


Cys Ser 


He 


Tyr Thr Gin 


His 






110 








115 




120 


Val 


Thr Gin 


His 


Arg Met Ala 


Trp 


Asp Met 


Met 


Met Asn Trp 


Ser 






125 








130 




135 


Pro 


Thr Thr 


Thr 


Leu 


Val Leu 


Ser 


Ser He 


Leu 


Arg Val Pro 


Glu 


lie 




140 








145 




150 


Cys Ala 


Ser 


Val 


He Phe 


Gin 


Gin His 


Trp 


Gin He Leu 




Ala 




155 








160 


165 


Leu 


Val Ala 


Tyr 


Phe 


Gin Met 


Ala 


Gin Asn 


Trp 


Leu Lys Val 








170 








175 


180 


Leu 


Ala 


Val Leu 


Phe 
185 


Leu 


Phe Ala 


Gin 


Val Glu 
190 


Ala 





(2) INFORMATION FOR SEQ ID NO: 103: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 103: 

GCGTCCGGGT TCTGGAAGAC GGCGTGAACT ATGCAACAGG 40 

(2) INFORMATION FOR SEQ ID NO: 104: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 104: 

AGGCTTTCAT TGCAGTTCAA GGCCGTGCTA TTGATGTGCC 40 



(2) INFORMATION FOR SEQ ID NO: 105 
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(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 105: 

AAGACGGCGT GAACTATGCA ACAGGGAACC TTCCTGGTTG 

(2) INFORMATION FOR SEQ ID NO: 106: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 106 

AGTTCAAGGC CGTGCTATTG ATGTGCCAAC TGCCGTTGGT 



(2) INFORMATION FOR SEQ ID NO: 107: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 107: 

AAGACGGCGT GAATTCTGCA ACAGGGAACC TTCCTGGTTG 



(2) INFORMATION FOR SEQ ID NO: 108: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 108: 

AGTTCAAGGC CGTGGAATTC ATGTGCCAAC TGCCGTTGGT 



(2) INFORMATION FOR SEQ ID NO: 109: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 base pairs 
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(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 109: 

ARCTYCGACG TYACATCGAY CTGCTYGTYG GRAGYGCCAC CC 
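
The oligonucleotides from SEQ ID NO: 109 onward contain letters other than A, C, G and T (R, Y, W, K and so on). Assuming these are the standard IUPAC nucleotide ambiguity codes, which the listing itself does not state, each such oligonucleotide stands for a mixture of concrete sequences. The following Python sketch is illustrative only; the helper names `degeneracy` and `expand` are hypothetical and not part of the specification.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes (assumed to apply to this listing).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(oligo: str) -> int:
    """Count the distinct sequences a degenerate oligonucleotide represents."""
    count = 1
    for base in oligo.replace(" ", "").upper():
        count *= len(IUPAC[base])
    return count


def expand(oligo: str):
    """Yield every concrete sequence encoded by a degenerate oligonucleotide."""
    pools = [IUPAC[base] for base in oligo.replace(" ", "").upper()]
    for combo in product(*pools):
        yield "".join(combo)


# SEQ ID NO: 109 as printed in the listing (spaces separate blocks of ten).
SEQ_ID_109 = "ARCTYCGACG TYACATCGAY CTGCTYGTYG GRAGYGCCAC CC"

print(len(SEQ_ID_109.replace(" ", "")))  # 42, matching the declared LENGTH
print(degeneracy(SEQ_ID_109))            # 256
```

Under that assumption SEQ ID NO: 109 has eight two-fold positions (two R and six Y), so it represents 2^8 = 256 concrete 42-mers.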


(2) INFORMATION FOR SEQ ID NO: 110: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 110: 

RCARGCCRTC TTGGAYATGA TCGCTGGWGC Y 



(2) INFORMATION FOR SEQ ID NO: 111: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 111: 

CRATACGACR YCAYGTCGAY TTGCTCGTTG GGGCGGCTRY YT 



(2) INFORMATION FOR SEQ ID NO: 112: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 112: 

RCAAGCTRTC RTGGAYRTGG TRRCRGGRGC C 



(2) INFORMATION FOR SEQ ID NO: 113: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
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(D) TOPOLOGY: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 113: 

TTGCGGACKC ACATYGACAT GGTYGTGATG TCCGCCACGC 40 

(2) INFORMATION FOR SEQ ID NO: 114: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 114: 

GATGCGCGTT CCCGAGGTCA TCWTAGACAT CRTYRGCGGR GCD 43 

(2) INFORMATION FOR SEQ ID NO: 115: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 54 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 115: 

AATGGCACCY TGCRCTGCTG GATACAAGTR ACACCTAATG TGGCTGTGAA 50 
ACAC 54 

(2) INFORMATION FOR SEQ ID NO: 116: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 116: 

ARCTAGYC CTYSARGTYG TCTTCGGYGG Y 31 



(2) INFORMATION FOR SEQ ID NO: 117: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 54 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 117: 

GCCAACGTCT CTCGATGTTG GGTGCCGGTT GCCCCCAATC TCGCCATAAG 
TCAA 

(2) INFORMATION FOR SEQ ID NO: 118: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 46 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 118: 

AAGGGCCTGC GAGCACACAT CGATATCATC GTGATGTCTG CTACGG 

(2) INFORMATION FOR SEQ ID NO: 119: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 45 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 119: 

TTGGTGCGCA TCCCGGAAGT CATCTTGGAT ATTGTTACAG GAGGT 






(2) INFORMATION FOR SEQ ID NO: 120: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 120: 

AGTCAGGTAY GTCGGAGCAA CCACCGCYTC GATACGCAGT 

(2) INFORMATION FOR SEQ ID NO: 121: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 46 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 121: 
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AGCCTTCACG TTCAGACCKC GTCGCCATCA AACRGTCCAG ACCTGT 



(2) INFORMATION FOR SEQ ID NO: 122: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 75 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 122: 

TCCCCCGCYG TGGGTATGGT GGTRGCGCAC RTYCTGCGDY TGCCCCAGAC 
CKTGTTYGAC ATAMTRGCYG GGGCC 


(2) INFORMATION FOR SEQ ID NO: 123: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 123: 

ACGCCGGTGA CGCCTACAGT GGCTGTCGCA CAGCCGGGC 



(2) INFORMATION FOR SEQ ID NO: 124: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 124: 

ATGAGGGTCC CCACAGCCTT TCTCGACATG GTTGCCGGAG GC 



(2) INFORMATION FOR SEQ ID NO: 125: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 125: 

CGCGCCCTAT CCCAACGCAC CGTTAGAGTC CATGCGCAGG 
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(2) INFORMATION FOR SEQ ID NO: 12 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 49 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 126: 

TCAGATCTTA CGGATCCCCT CTATCCTAGG TGACTTGCTC ACCGGGGGT 



(2) INFORMATION FOR SEQ ID NO: 127: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 54 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 127: 

CAGTCACGCT GCTGGGTGGC CCTTACTCCC ACCGTGGCGG YGYCTTATAT 
CGGT 



(2) INFORMATION FOR SEQ ID NO: 128: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 128: 

TAGCACTCTG GTRGAYCTAC TCRCTGGAGG G 



(2) INFORMATION FOR SEQ ID NO: 129: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 54 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 129: 

AAGTCTACAT GCTGGGTGTC TCTCACCCCC ACCGTGGCTG CGCAACATCT 
GAAT 
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(2) INFORMATION FOR SEQ ID NO: 130: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 130: 

AGGCGCCATG GTCGACCTGC TTGCAGGCGG C 



(2) INFORMATION FOR SEQ ID NO: 131: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 131: 

TCAGCCCCGA VYYTCGGAGC GGTCACGGCT CCTCTTCGGA GGG 



(2) INFORMATION FOR SEQ ID NO: 132: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 44 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 132 

TGYTACGGAT YCCCCARGTG GTCATHGACA TCATWGCCGG GGSC 






(2) INFORMATION FOR SEQ ID NO: 133: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 133: 

CATACCAAAT GCTTCCACGC CCGCAACGGG ATTCCGCAGG 



(2) INFORMATION FOR SEQ ID NO: 134 
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(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 37 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 134: 


TCTTCTTGCG GGCGCCGCAG TGGTTTGCTC ATCCCTG 37 

(2) INFORMATION FOR SEQ ID NO: 135: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 52 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 135: 

ATCTAGCATC TTGAGGGTAC CTGAGATTTG TGCGAGTGTG ATATTTGGTG 50 
GC 52 

(2) INFORMATION FOR SEQ ID NO: 136: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 136: 

Trp Ile Gln Val Thr Pro Asn Val Ala Val Lys His Arg Gly Ala 

5 10 15 

Leu Thr His Asn Leu Arg Xaa His Xaa Asp Xaa Ile Val Met Ala 

20 25 30 

Ala Thr Val 
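
The peptides of SEQ ID NO: 136 onward are written in three-letter amino acid code, with Xaa marking a position that is not fixed. For database searches or alignment it is often convenient to convert them to one-letter code. The following Python sketch assumes the standard three-letter abbreviations and maps Xaa to X; the function name `to_one_letter` is hypothetical and not part of the specification.

```python
# Standard three-letter to one-letter amino acid codes; Xaa (unspecified) -> X.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
    "Xaa": "X",
}


def to_one_letter(three_letter: str) -> str:
    """Convert a whitespace-separated three-letter peptide to one-letter code.

    Unknown tokens raise KeyError, which also flags transcription errors.
    """
    return "".join(THREE_TO_ONE[residue] for residue in three_letter.split())


# SEQ ID NO: 136 with standard residue spellings.
SEQ_ID_136 = (
    "Trp Ile Gln Val Thr Pro Asn Val Ala Val Lys His Arg Gly Ala "
    "Leu Thr His Asn Leu Arg Xaa His Xaa Asp Xaa Ile Val Met Ala "
    "Ala Thr Val"
)

peptide = to_one_letter(SEQ_ID_136)
print(peptide)       # WIQVTPNVAVKHRGALTHNLRXHXDXIVMAATV
print(len(peptide))  # 33, matching the declared LENGTH
```

Because any token outside the table raises an error, the same helper can serve as a quick check that a transcribed peptide contains only valid residue names and has the declared length.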



(2) INFORMATION FOR SEQ ID NO: 137: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 137: 

Trp Val Pro Val Ala Pro Asn Leu Ala Ile Ser Gln Pro Gly Ala 

5 10 15 

Leu Thr Lys Gly Leu Arg Ala His Ile Asp Ile Ile Val Met Ser 

20 25 30 

Ala Thr Val 



(2) INFORMATION FOR SEQ ID NO: 138: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 138: 

Trp Ile Pro Val Xaa Pro Asn Val Ala Val Xaa Xaa Pro Gly Ala 

5 10 15 

Leu Thr Gln Gly Leu Arg Thr His Ile Asp Met Val Val Met Ser 

20 25 30 

Ala Thr Leu 



(2) INFORMATION FOR SEQ ID NO: 139: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 139: 

Trp Thr Xaa Val Thr Pro Thr Val Ala Val Arg Tyr Val Gly Ala 

5 10 15 

Thr Thr Ala Ser Ile Arg Ser His Val Asp Leu Leu Val Gly Ala 

20 25 30 

Ala Thr Xaa 






(2) INFORMATION FOR SEQ ID NO: 140: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 
(D) TOPOLOGY: unknown 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 140: 

Trp Val Ala Leu Xaa Pro Thr Leu Ala Ala Arg Asn Xaa Xaa Xaa 

5 10 15 

Xaa Thr Xaa Xaa Ile Arg Xaa His Val Asp Leu Leu Val Gly Ala 

20 25 30 

Ala Xaa Phe 
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(2) INFORMATION FOR SEQ ID NO: 141: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 141: 

Trp Val Xaa Xaa Xaa Pro Thr Val Ala Thr Arg Asp Gly Lys Leu 

5 10 15 

Pro Xaa Xaa Gln Leu Arg Arg Xaa Ile Asp Leu Leu Val Gly Ser 

20 25 30 

Ala Thr Leu 

(2) INFORMATION FOR SEQ ID NO: 142: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 142: 

Trp Thr Pro Val Thr Pro Thr Val Ala Val Ala His Pro Gly Ala 

5 10 15 

Pro Leu Glu Ser Phe Arg Arg His Val Asp Leu Met Val Gly Ala 

20 25 30 

Ala Thr Leu 



(2) INFORMATION FOR SEQ ID NO: 143: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 143: 

Trp Val Ala Leu Thr Pro Thr Val Ala Xaa Xaa Tyr Ile Gly Ala 

5 10 15 

Pro Leu Xaa Ser Xaa Arg Arg His Val Asp Leu Met Val Gly Ala 

20 25 30 

Ala Thr Val 






(2) INFORMATION FOR SEQ ID NO: 144: 

(i) SEQUENCE CHARACTERISTICS: 
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(A) LENGTH: 33 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 144: 

Trp Val Ser Leu Thr Pro Thr Val Ala Ala Gln His Leu Asn Ala 

5 10 15 

Pro Leu Glu Ser Leu Arg Arg His Val Asp Leu Met Val Gly Gly 

20 25 30 

Ala Thr Leu 



(2) INFORMATION FOR SEQ ID NO: 145: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 145: 

Trp Val Pro Leu Thr Pro Thr Val Ala Ala Pro Tyr Pro Asn Ala 

5 10 15 

Pro Leu Glu Ser Met Arg Arg His Val Asp Leu Met Val Gly Ala 

20 25 30 

Ala Thr Met 






(2) INFORMATION FOR SEQ ID NO: 146: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 146: 

Trp Val Xaa Ile Thr Pro Thr Leu Ser Ala Pro Xaa Xaa Gly Ala 

5 10 15 

Val Thr Ala Pro Leu Arg Arg Xaa Val Asp Tyr Leu Ala Gly Gly 

20 25 30 

Ala Ala Leu 

(2) INFORMATION FOR SEQ ID NO: 147: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 147: 

Trp His Ala Val Thr Pro Thr Leu Ala Ile Pro Asn Ala Ser Thr 

5 10 15 

Pro Ala Thr Gly Phe Arg Arg His Val Asp Leu Leu Ala Gly Ala 

20 25 30 

Ala Val Val 

(2) INFORMATION FOR SEQ ID NO: 148: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 148: 

Thr Leu Thr Met Ile Leu Ala Tyr Ala Ala Arg Val Pro Glu Leu 

5 10 15 

Xaa Leu Xaa Val Val Phe Gly Gly 

20 

(2) INFORMATION FOR SEQ ID NO: 149: 






(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 149: 

Thr Thr Thr Met Leu Leu Ala Tyr Leu Val Arg Ile Pro Glu Val 

5 10 15 

Ile Leu Asp Ile Val Thr Gly Gly 

20 

(2) INFORMATION FOR SEQ ID NO: 150: 






(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 150: 

Thr Xaa Thr Xaa Ile Leu Ala Tyr Xaa Met Arg Val Pro Glu Val 

5 10 15 

Ile Xaa Asp Ile Xaa Xaa Gly Ala 

20 
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(2) INFORMATION FOR SEQ ID NO: 151: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 151: 

Ala Val Gly Met Val Val Ala His Xaa Leu Arg Leu Pro Gln Thr 

5 10 15 

Xaa Phe Asp Ile Xaa Ala Gly Ala 

20 



(2) INFORMATION FOR SEQ ID NO: 152 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 152: 

Thr Xaa Ala Leu Val Xaa Ser Gln Leu Leu Arg Xaa Pro Gln Ala 

5 10 15 

Xaa Xaa Asp Xaa Val Xaa Gly Ala 

20 



(2) INFORMATION FOR SEQ ID NO: 153 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 153: 

Thr Xaa Ala Leu Val Xaa Ala Gln Leu Leu Arg Xaa Pro Gln Ala 

5 10 15 

Xaa Leu Asp Met Ile Ala Gly Ala 

20 



(2) INFORMATION FOR SEQ ID NO: 154: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 
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(D) TOPOLOGY: unknown 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 154: 

Thr Thr Thr Leu Leu Leu Ala Gln Ile Met Arg Val Pro Thr Ala 

5 10 15 

Phe Leu Asp Met Val Ala Gly Gly 

20 

(2) INFORMATION FOR SEQ ID NO: 155: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 155: 

Thr Thr Thr Leu Xaa Leu Ala Gln Val Met Arg Ile Pro Ser Thr 

5 10 15 

Leu Val Asp Leu Leu Xaa Gly Gly 

20 

(2) INFORMATION FOR SEQ ID NO: 156: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 156: 

Thr Ala Thr Leu Val Leu Ala Gln Leu Met Arg Ile Pro Gly Ala 

5 10 15 

Met Val Asp Leu Leu Ala Gly Gly 

20 

(2) INFORMATION FOR SEQ ID NO: 157: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 157: 



Thr Ser Ala Leu Ile Met Ala Gln Ile Leu Arg Ile Pro Ser Ile 

5 10 15 
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Leu Gly Asp Leu Leu Thr Gly Gly 

20 



(2) INFORMATION FOR SEQ ID NO: 158: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 158: 

Xaa Thr Ala Leu Xaa Met Ala Gln Xaa Leu Arg Ile Pro Gln Val 

5 10 15 

Val Ile Asp Ile Ile Ala Gly Xaa 

20 

(2) INFORMATION FOR SEQ ID NO: 159: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 159: 

Thr Thr Thr Leu Val Leu Ser Ser Ile Leu Arg Val Pro Glu Ile 

5 10 15 

Cys Ala Ser Val Ile Phe Gly Gly 

20 
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CLAIMS 

1. A cDNA of the envelope 1 gene of the 
hepatitis C virus wherein the cDNA has a sequence selected 
from the group consisting of SEQ ID NO:1 through SEQ ID 
NO:51. 

2. A recombinant hepatitis C virus envelope 1 
protein encoded by a gene whose sequence includes a 
sequence selected from the group consisting of SEQ ID NO:1 
through SEQ ID NO: 51. 

3. A recombinant protein having an amino acid 
sequence selected from the group consisting of SEQ ID NO: 52 
through SEQ ID NO: 102. 

4. A method for the recombinant DNA-directed 
synthesis of at least one complete envelope 1 protein of 
hepatitis C virus, said method comprising: 

culturing a transformed or transfected host 
organism containing a DNA sequence capable 
of directing the host organism to produce an 
envelope 1 protein under conditions such 
that the protein is produced, said protein 
exhibiting substantial homology to a protein 
comprising the amino acid sequence selected 
from the group consisting of SEQ ID NO: 52 
through SEQ ID NO: 102. 

5. The method of claim 4, wherein the host 
organism is transfected with a recombinant eukaryotic 
expression vector. 

6. The method of claim 4, wherein the 
eukaryotic vector is a baculovirus vector. 






7. The method of claim 4, wherein the host 
organism is a eukaryotic cell. 

8. The method of claim 7, wherein the 
eukaryotic cell is an insect cell. 

9. A recombinant expression vector comprising a 
cDNA sequence selected from the group consisting of SEQ ID 
NO:1 through SEQ ID NO: 51. 






10. A host organism transformed or transfected 
with a recombinant expression vector according to claim 9. 



11. A method of detecting antibodies to HCV in a 
biological sample suspected of containing said antibodies 
comprising: 

(a) contacting the sample with at least one 
recombinant protein of claim 3 to form 
an immune complex with the antibodies; 
and 

(b) detecting the presence of the immune 
complex. 

12. The method of claim 11 wherein the 
biological sample is selected from the group consisting of 
serum, saliva or lymphocytes or other mononuclear cells. 


13. The method of claim 11, wherein the 
recombinant envelope 1 protein is bound to a solid support. 

14. The method of claim 11, wherein the immune 
complex is detected using a labeled antibody. 

15. A hepatitis C virus kit comprising: at least 
one recombinant protein comprising an amino acid sequence 
selected from the group consisting of: SEQ ID NO: 52 through 
SEQ ID NO: 102. 
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16. A pharmaceutical composition comprising at 
least one recombinant protein of claim 3 and a suitable 
excipient, diluent or carrier. 

17. A method of preventing hepatitis C 
infection, comprising administering the pharmaceutical 
composition of claim 16 to a mammal in an effective amount 
to stimulate the production of protective antibody. 

18. A vaccine for immunizing a mammal against 
hepatitis C infection, comprising at least one recombinant 
protein according to claim 3 in a pharmacologically 
acceptable carrier. 

19. A method for detecting the presence of the 
hepatitis C virus via a reverse transcription-polymerase 
chain reaction process, wherein the primers are selected 
from the sequences shown in SEQ ID NO: 103 through SEQ ID 
NO: 108. 



20. Substantially isolated and purified primers, 
wherein said primers have nucleic acid sequences selected 
from the group consisting of SEQ ID NO: 103 through SEQ ID 
NO:108. 



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21. A diagnostic kit for use in detecting the 
presence of hepatitis C virus, said kit comprising: primers 
having nucleic acid sequences selected from the group 
consisting of SEQ ID NO: 103 through SEQ ID NO: 108. 

22. A method for determining the genotype of a 
hepatitis C virus, said method comprising: 

(a) amplifying RNA via reverse 
transcription-polymerase chain reaction 
to produce amplification products; 

(b) contacting said products with at least 
one genotype-specific oligonucleotide; 
and 

(c) detecting complexes of said products 
which bind to said oligonucleotide(s). 
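
Steps (b) and (c) can be mimicked in silico as a rough check of which listed oligonucleotides could recognise a given amplification product. The Python sketch below is a simplification under stated assumptions: degenerate letters are read as standard IUPAC ambiguity codes, binding is modelled as an exact match on the same strand (real hybridization tolerates some mismatches and depends on stringency), and `toy_amplicon` is an invented sequence used only for illustration. The function names are hypothetical, and the entries are keyed by SEQ ID NO because this section does not state which genotype each oligonucleotide detects.

```python
import re

# IUPAC ambiguity codes expressed as regular-expression character classes.
IUPAC_REGEX = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def probe_pattern(probe: str) -> "re.Pattern[str]":
    """Compile a degenerate oligonucleotide into a regular expression."""
    bases = probe.replace(" ", "").upper()
    return re.compile("".join(IUPAC_REGEX[base] for base in bases))


def matching_probes(amplicon: str, probes: dict) -> list:
    """Return the names of probes that match the amplicon exactly."""
    target = amplicon.replace(" ", "").upper()
    return [name for name, seq in probes.items()
            if probe_pattern(seq).search(target)]


# Two oligonucleotides as printed in the sequence listing.
PROBES = {
    "SEQ ID NO: 110": "RCARGCCRTC TTGGAYATGA TCGCTGGWGC Y",
    "SEQ ID NO: 112": "RCAAGCTRTC RTGGAYRTGG TRRCRGGRGC C",
}

# Invented amplicon: one concrete form of SEQ ID NO: 110 with short flanks.
toy_amplicon = "GGG" + "ACAAGCCATCTTGGACATGATCGCTGGAGCC" + "TTT"

print(matching_probes(toy_amplicon, PROBES))  # ['SEQ ID NO: 110']
```

An exact-match model is deliberately strict; a closer approximation of hybridization would allow a small number of mismatches and would also test the reverse complement.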

23. The method of claim 22, wherein said 
amplification of step (a) uses a primer having a sequence 
according to SEQ ID NO:103 through SEQ ID NO: 108. 

24. The method of claim 23, wherein said 
oligonucleotide of step (b) is a nucleic acid sequence 
selected from the group consisting of SEQ ID NO: 109 through 
SEQ ID NO: 135. 

25. Substantially isolated and purified 
oligonucleotides, wherein said oligonucleotides have 
nucleic acid sequences selected from the group consisting 
of SEQ ID NO: 109 through SEQ ID NO: 135. 

26. A diagnostic kit for determining the 
genotype of a hepatitis C virus, said kit comprising 
primers selected from the group consisting of SEQ ID NO: 103 
through SEQ ID NO: 108 and hybridization probes selected 
from the group consisting of SEQ ID NO: 109 through SEQ ID 
NO:135. 

27. A substantially purified and isolated 
peptide having an amino acid sequence selected from the 
group consisting of SEQ ID NO: 136 through SEQ ID NO: 159. 

28. A method of detecting antibodies specific 
for a single genotype of HCV, said method comprising: 

(a) contacting a biological sample with at 
least one peptide of claim 27 to form 
an immune complex with the antibodies, 
and 
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(b) detecting the presence of the immune 
complex. 

29. The method of claim 28, wherein the 
biological sample is selected from the group consisting of 
serum, saliva or lymphocytes or other mononuclear cells. 

30. The method of claim 28, wherein said peptide 
is bound to a solid support. 

31. The method of claim 28, wherein the immune 
complex is detected using a labelled antibody. 



32. A kit for use in detecting hepatitis C virus 
antibodies, said kit comprising: at least one peptide 
selected from the group consisting of SEQ ID NO: 136 through 
SEQ ID NO: 159. 

33. A pharmaceutical composition comprising at 
least one peptide of claim 27 and a suitable excipient, 
diluent or carrier. 

34. A method of preventing hepatitis C 
infection, comprising administering the pharmaceutical 
composition of claim 33 to a mammal in an effective amount 
to stimulate production of a protective antibody. 



35. A vaccine for immunizing a mammal against 
hepatitis C infection, comprising at least one peptide 
according to claim 27 in a pharmaceutically acceptable 
carrier. 
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FIGURE 1A 
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428 TaTCGCAGTTACTCCGaATCCCACAAGCTGTCgTGGACATGGTC 
428 TgTCGCAGTTACTCCGGATCCCACAAGCTGTCaTGGACAT^ 
428 TATCCKziJrre^^ 
428 TATC&CAGTTACTCCCX&TCCCACAAGCTGTCa 
428 TCTCGCAGlTACrcCGGATCCCGCAAGCTCTCXnYxGACATG 
428 TCTCGCAGTTACTCCGGATCCCGCAAGCTaTCG^ 
428 T(nCGCAaTTACTCCGGATCCC<K3AGCra 
428 TGTCGCAGTTgCTCCGGATCCCACAAGCTGTCGTGGACATGGTC 
428 TATCGCAGTTaCKXXjGATCCCACAAGCTGTCGTGG 
428 TATCXXIAGTTGCTCCGGATCCCACAAG 
428 TATCGCAffTTGCTCCGGATCCCACAAGCrorcGT^ 
428 TATCGCAGTTACTCCGGATCCCACAAGCTaTCG^ 
428 TATCXSCAGTTACTCCGGATCCCACJ^^ 
428 TATCGCMTTACTCCGGATCCCACAAGCTGTCAT^ 
428 TATCGCAGTTACTCCXraCTCCCACAAGCTCT 
428 TgTCGCAGCCACTCCG&ATCCCACAA^ 
428 TaTCGCAGCTACTCOCKaATCCXACAAGCTgTC 

TaTCGCAgtTaCTCCGgaTCCCaCAAGCTgTCgTGGAcaT^ 
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FIGURE IB 



SEQ ID NO:   Isolate 
21           SA10 
20           S45 
25           US6 
13           HK4 
18           P10 
19           S9 
9-25         consensus 

489 GGTCCXX3GCX3GGCCTCGCCTACTATTCCAT 
489 GGTCCTGGCGGC&CTTGCCTACT^ 
489 iUmxrniCKXKKSCCTTGCX^ 
489 AGTCCITXXX^CCTiraaTACTATTC^^ 
489 AATCCTGGCGGGCCTTGCClTlCTA^ 
&ATCCT 
489 AGTCCTGGCGGGCCri\i<^auZTATTCC^^ 
AGTCCTaGCGGGCCTTGC 
TTGATTGTG 
TTGATTGTG 

ag^CCTgGOGGGCCTtGCc^ACTIltTCCATGGtgC^gAACTGGGQJUkGGTttTSATTGTg 
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FIGURE IB 



SEQ ID NO:   Isolate 
10           D3 
9            D1 
14           HK5 
15           HK8 
17           IND8 
21           SA10 
20           S45 
25           US6 
13           HK4 
18           P10 
19           S9 
9-25         consensus 

550 tTGCTACTCTTTGCCGGCGTTGATGCG 
550 ATGCTACTCTTTGCCGGCGTTGATGGG 
550 ATGCTACTCTTTGCTGGCGTcGACGGC 
550 ATGCTACTCTTTGCTGGCGTTGACGGC 
550 ATGCTACTCTTTGCCGGCGTTGATGGG 
550 ATGCTACTgTTreCCGGCGTTGATGGG 
550 ATGCTACTtTTTGCCGGCGTTGATGGG 
550 CTGCTACTCTTTGCCGGCGTTGA1X3GG 
550 ATGCTACTCTTTGCtGGCGTTGACGGG 
550 ATGCTACTCTTTCCCGGCGTTGACGGG 
550 ATGCTACTCTTTGCCGGCGTTGACGGG 
550 ATGCTACTCTTTGCCGGCGTTGACGGG 
550 ATGCTACTCTTTGCCGGCGTTGACGGG 
550 tTGCTACTCTTTGCCGGCGTTGACGGG 
550 ATGCTACTCTTTGCCGGCGTTGACGGG 
550 ATGCTACTCTTTGCCGGCGTTGACGGa 
550 ATGCTACTtTTTCCtGGtGTTGACGGg 

aTGCTACTcTZTGCcGGcGTtGAcGGg 
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FIGURE 1C 



SEP ID NO: Isolate 
26 T2 



27 
28 
29 
26-29 



T4 
T9 
TJS10 
consensus 



GCcCAAGTGAgGAACACCAgc cgCgG tTACTlTGGTGAC tAACGACXXSTTCcAATGAgAGCA 

ii iiinii iiiniii i i iiiiiiiiiii iiiiiiiiiii inn mi 

GCaCAAGTGAAGAACACCAcTAaCAGCTACATGG^^ 

II MINI llflllill II 1 1 M 1 11 1 1 1 II 1 1 1 II II II I M i II IIINII 

GCCgAAGTG/J^GAACACCAGTACCAGC^^ 

i i mini iimmmiiim miiiii iiiniii mimiimi 

GtCcJUUSTGAAaAAOtCOCTA^ 

GcccAAGTGAagAACACCAgtacCaGCTAcATGGTGACcAA 



SEP ID BO: 
26 

27 

28 

29 

26-29 



Isolate 
T2 

T4 

T9 

US10 

consensus 



62 TCACcTGGCAGCTCCAaGCCGCGGTtCTCCACGTC 

mi iiiiiiiiiii 1 1 1 1 i 1 1 i iinniiiiiiiiiin miiiii mi 

62 TCACtTGGCAGCTCCAGGCCGCGGTCCTCCACGTCCCCXKSGTGTGTC 



mi rim mimimmmmii 



62 TCACCTXKX!AACTCCAGGCOGCXjGTCCTCCAO^ 



llll IIINIII III) 1 1 1 i 1 ! I ! 1 1 1 1 1 



f 1 1 1 1 1 1 1 1 mm 



1 1 1 1 i 1 1 1 mm 



mm i 



mm m 



62 TCACtTGGCAACTtgAGGCtGCGGTCCTCCACGTtCCCGGGT 

TCAC -TGGCA- CTccAgGCcGCGGTcCTCCACGTcCCCGGGTGtgTCCC^ agt 



SEP ID NO 
26 

27 

28 

29 

26-29 



Isolate 
T2 

T4 

TO 

US 10 

consensus 



123 GGGAAATACATC cCGaTGCTGGATACCGGTcaCACCAAACGTGGCCGTGCGGCAG^ 

i 1 1 1 1 1 1 1 1 1 1 1 n iiiiiNiiiiiii iiiimimiiimiimimm 

123 GGGAAATACATCtCXKTTGCTGGATACCGGTtTCACCAAAOGTGGCCGTC 

inn i n immmmim n iiiniii ii mm ii in in 

123 tGGAAAcgCgTCgCGGTGCTGGATACCGGTCTCgCCAAA^ 

inn i ii mmimiimiim inn ii ii immiimiii 

123 gGGAAAtaCaTCtCGGTGCTGGATACCGGTCTCaCCAAAtGTgGC 

gGGAAAtaC^TCtCGgTGCTGGATACCGGTctCaCCAAAcGTg^cGTt^ - GC - GCC - GGC 



SEP ID KO: Isolate 
26 T2 



27 
28 
29 
26-29 



T4 
TO 
US10 
consensus 



184 



184 



GCtCl^ACGattSGGCTTGCGGAOXA 

n n imiiimimimimi mmmiimiimmimimi 

GCCCI^CGCAGGGCTIXKX^CGCACATtGACA 



1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 M M 1 1 1 1 1 1 



M i 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 i 1 1 1 1 iiii 



1 1 1 1 ! 1 1 1 1 ! 



184 GCCCTCACGCAGGGCTTGCGGAOSCACATCGACATGG^ 



I M 1 1 1 1 1 1 I 



II 



II 



IIIIIIIIIII 



IIIIIIIIIII 



inn 



inn 



184 GCCCICACGCAGGGCTTGOGGACtGACATCGAA 

GCcCTcACGCAGGGCTTGCGGACgCACATcGACATGGT^ 



SEQ ID NO:
26 

27 

28 

29 



Isolate 
T2 

T4 

T9
US10 



245 ClXTCcCTcTAantraSGGACCTCTGCGGO^^ 

llll II 1 1 1 1 1 i 1 i 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 M M 1 1 1 1 1 1 1 1 1 1 1 1 JMI 1 1 1 II 

245 CTGCTCTtTACGTGGGGGACCTKrrGC&GOGG 

i inn mnmm imimmm iiiniii ii iiiiiiiiiii i 

245 CCGCTCTcTACGTGGGGGAtCTCTGCGGCGGGGTaATGCTC 

mini iiiiiiiiiii miiiii ni i iiiniii ii ii iniinii i 

245 CCGCTCTtTACGTGGGGGActTCTGCGGtGGGaTgATGCTCGCaGCcCAaATC 



26-29 



consensus 



C - GCtCT -TACGTGGGGGAccTCTGCGGcGGGgTgATGCTCGCaGCcC^
FIGURE 1C



SEQ ID NO:
26 

27 

28 

29 

26-29



Isolate 
T2 

T4 

T9 

US10 

consensus 



306 CTCGCCGCgACgcCACTGGTTTCTGCAAGA^ 

1 1 1 ! 1 1 1 1 II II II I Ml II III III I lllllllllll lllillll II 1 1 1 1 1 1 

306 CTCGCCGC&ACAtCACTGGTITGTGCAA 

1 1 1 1 1 1 1 1 1 ii iiiiiiiiiiiin ii inn inn ii iiiiini linn 

306 CTCGCCXjCAgCACCACTGGTITGTGCAGGAATGC^ 

iiiiini liiiin 1 1 1 f 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 inn 1 1 1 1 1 1 1 1 1 

306 CTCGCCGCgcCACC^VCTcGTTTGTGCJlGGAATGC^ 

CTC6C0GC - aCacCACTgGTTTGTGCA- GAaTGCAA-TGCTCcATCTACCC - GG tACCATC 



SEQ ID NO:
26 

27 

28 

29 

26-29 



Isolate 
T2 

T4 
T9

US10

consensus 



367 JUnXXHlCACCCTATGGaH^ 

M I It 1 1 1 1 M 1 1 1 1 1 1 1 M I! I 1 1 1 i 1 1 1 M 1 1 1 1 1 1 1 1 ! 1 1 1 i I 1 1 1 1 1 1 1 1 1 1 1 1 1 

367 ACTGGACACCXnTlTGGCATGGGA 

iiimiimmimmii miiiimimmmm miimm 

367 ACIX3GACACCGTATGGCATGGGACATGATGATG 

II II I I I I t I I I I i I I i I I 1 I I I I I I I I I I I I 1 I i t 1 1 I I I 1 1 ! ! UN Mill 
367 ACcGGgCACCGTATGGCATGGGACATGATGArGAACTGGTOG^^ 

ACtGGaCACXXrTATGGCATGGGAcAIGA^ 



SEQ ID NO:
26 

27 

28 

29 

26-29 



Isolate 
T2 

T4 

T9
US10 
consensus 



428 TGGCGTACGCGATCCGCCTTCCCGA^ 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 i 1 1 1 1 1 1 1 1 1 1 1 1 1 1 MIIMM I 1 1 1 1 1 1 1 lllillll 

428 TGGCCnACGCGATCCGCGTTCCCGAGCT 

1 1 1 1 1 1 1 1 f 1 1 1 1 f f 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 lllillll I Mill II MIIMM 

428 TGGCGTACGCGATQCGCXnTCCCGAGaTCATCAT^ 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 mil m ii imi 

428 TGGCGTACGtGATCCGCGTTCCCGAGGTCATC^^ 
TG6C6TAC6cGAT6CGCGTTCCCGAC^3^^ 



SEQ ID NO: Isolate
26 T2 



27 
28 
29 
26-29 



T4 
T9 
US10 
consensus 



489 CGTCATGTTtGGCITGGCCTACTTCTCTATGCAG 

f 1 1 1 1 1 1 i I ! 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 i 1 1 1 1 1 1 1 1 !! 1 1 III MIIMM 

489 CGTCATGTTCGGCTTCGCCTACTTCTCTATGCAG 

IIMIIIIMIII I 1 1 1 1 1 1 1 M f 1 1 1 1 1 ! 1 1 1 1 1 1 1 1 1 1 M 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 

489 CGTCAICTTCGGCC^AGCCTACITCTC^^ 

MM I I f I I 1 I I 1 I 1 I t I I I I I I t I I I I I I I I I I t I I 1 I I I I I I II I I I I I I I I I i i I 
489 CGTCtTGTTCGGCtTAGCCrACTTCrCTATGaUa^^ 

OGTCaTGTTcGGCtT - GCCTACTTCTCTATGCAGGGAG 



SEQ ID NO:


Isolate 






26 


T2 


550 


CTctTGCTGGCtGCTGGGGTGGACGCG








II I 1 1 1 1 1 1 1 I 1 1 1 1 f 1 1 1 1 1 1 1 1 


27 


T4 


550 


CTtCTGCTGGCCGCTGGGGTGGACGCG 








ii mi miiii iiiiinii 


28 


T9


550 


CTgtTGCTcaCCGCTGGcGTGGACGCG 








II MM IMIMI 1 1 f 1 1 1 1 1 1 


29 


US10


550 


CTtcTGCTagCCGCTGGgGTGGACGCG 


26-29 


consensus 




CTt -TGCTggCcGCTGGgGTGGACGCG
FIGURE 1D

SEQ ID NO: Isolate



33 T8 1 GTGGAAGTtJUGaAACAcCAGTTtt^ 

llllllll II Mil Mill Mill Ml ! II Mil III MMMI I llllllllll 

30 DK8 1 GTGGAAGTCAGGAACATOUnTCcAGCTACTACGCC^ 

lllllllllllllllllllllll llllllll lllllllllllllllllllll I lllll 

32 SW3 1 GTGGAA(nOU3GAACATCAGTTCT 

ii iii iiiiiiiiiii iiiiiiiii inn ii i ii iiiiiiiiiii inn i iiiii 

31 DK11 1 GTGGAAGTCAGGAACAcCAG^^ 
30-33 consensus CmSGAACTcAGgAACA- CJUOTctJ^ 



SEQ ID NO: Isolate

33 T8 62 TCACCTGGCAgCTCACCaACGCAGTTCTCCACCTTCCCGGATC 

iiiiiiiiii nun 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 f i 

30 DK8 62 TCACCTGGCAACTCACCgAOGCAGTTCTC 

1 1 1 1 1 ! 1 1 1 1 1 ii 1 1 1 1 mini iiiiiiiiiiiiiiiiiniiii iiiiiiiiiii 

32 SW3 62 TCACCTGGOlACTCACCAACGCAGTcCTCC^ 

lllllllllllllllllllllllll lllllllllllllllllllllll IIIIIIIIIII 

31 DK11 62 TOlCCTGGCAACTCACCAACGCAGTtCTCCACCT^ 

30-33 consensus TCACCTGGCAaCTCACCaACGCAGTtCT 



SEQ ID NO: Isolate

33 T8 123 CAATGGCACCtTGOXrreCTGGATACAAGTa 

llllllllll I I 1 1 1 1 I I I I 1 1 I I I I I I I I I I I I I I I I I I I I I I 1 1 I I I I I I 1 1 I III 

30 DK8 123 CAATCGCACCCTGCGCTCKrit^^ 

1 I I I I I I I I I I I I I II 1 I I I I I I I I I I 1 I I I I 1 I I I I I I I I I I I I 1 1 I I I I I II 1 I 1 I 1 
32 SW3 123 tiJVTGGCACCCTGCACTGCTGGATACAAGTGACA^ 

Mil III II III lllllllilllll II I III IIIIIIIII llllllllll III lllllll 

31 DK11 123 CAATCGCACCCTGCACTGCTGGATAC^AGTGACACCT 

30-33 consensus cJJVTOGCACCcTGC-CTGCTGGAIACAAGTgW 



SEQ ID NO: Isolate

33 T8 184 GCACTcACTCAcAACCTGCGAACgCAtGTCGAC 

lllll Mill IIIIIIIIIII II I I I I I 1 I I I I I I I I I I I 1 I 1 1 I I I I I I 1 1 I I I 1 I 

30 DK8 184 GCACJXACHXMdiCCTGQ^^ 

II II Mill IIIIIIIII llllllllll 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 11 1 1 M 1 1 

32 SW3 184 GCgCTCACTCACAACCTOCGJUK^ 

II I t I 1 t I 1 I I i I I I I I I I I I I I I I I llllllll IIIIIIIIIIIIIIIIIIMII 

31 DK11 184 GCaCTOkCTCACAACCTGCXSASC^^ 

30-33 consensus GCaCTcACTCAcAACCTGCGA- CaCA- gTcGA- -TGATcGTAMtK^AGCTACGGTCTGCT 



SEQ ID NO: Isolate

33 T8 245 CGGCCTTGTATGTGGGgGACGTqTGCGGGGCCOT 

i i 1 1 i i i i i i i i i i i i iiiii iimmiiiiiiiii i Milium iiiiiii 

30 DK8 245 CGGCCTTGTATGTXaGGJUSACGTaTGCGGGGCCGTGATC

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 1 1 I I I I I I I I I I I Mlllll 
32 SW3 245 CGGCCTTGTATGTGGGAOACaTGTGCGGGGCCXnX^TGATCGTGT^ 

I I I I 11 I I II MM I I II I I I I I I If I I I I I I I I I 1 I I I I I II I I 1 1 I I I I I I 1 I 1 1 I I 

31 DK11 245 CGGCCTTGTATGTGGGAGACgTGTGCGGGGCCGTGATGATCGT^ 

30-33 consensus CGGCCTTGTATCTGGGaGACgTgTGCGGGGCCGTGATGATcGtC 



FIGURE 1D



SEQ ID NO:
33 

30 

32 

31 

30-33 



Isolate 
T8 

DK8 

SW3 

DK11 

consensus 



306 



306 



ATCGCCaGAACGCCACAACTTcACCCAG^ 

Mill! I! II 11 Mi M II! Mil II I II 1 1 1 III 1 1 !l III I II 1 1 M 111 HUM I 

ATCGCCtGAACGCCAGAACTTTACCCAGGAGTGCAACTGTTCCOT 



unit 1 1 1 1 1 1 1 1 i 1 1 1 1 1 



1 1 1 ! f i 1 1 i 1 1 mi tin 



Mill 1 1 1 II ] I M M i 1 11 M 1 1 1 1 M M 11 1 MM 



306 ATCGCCAGAACGCCACAACITT^ 



iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii in 



306 ATCGCCAGAACaCCACcACTTTACCCAAGAGTGC^ 

ATCGCCaGAACgCCACaACTTtACCCA- GAGTGCAACTGTTCCATCTACCAAGGTCatATC 



SEQ ID NO: Isolate
33 T8 



30 
32 
31 
30-33 



DK8 
SW3 
DK11 
consensus 



367 ACCGGCCMOK^TGGCATGGGACATGA^ 



1 1 f 1 1 1 f 1 1 1 1 1 1 i 1 1 1 1 i 1 1 1 1 1 1 i 1 1 1 1 1 1 1 1 1 1 i ii 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 



367 ACCGGCCACCX5CATGGCATGGGACATGATGCTAAAC^ 

II 



inmiiiiiimii iiiiiii 



IIMIIMMIIIilll IIIIIII 



1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 f ff 1 1 



367 ACCGGCCACCGCATGGCgTGGGACATGATGCT 



iiiili i 1 1 1 ll i 1 1 1 1 1 M 



! 1 1 1 1 1 1 1 1 1 



II 



II I M I II 1 1 II 



367 ACCGKKX!ACCGCATGGCaTGGGACATGATGCTtAA^^ 

ACCGGCCACCGCATGGCaTGGGACATGATCCTaAACTGGTCACCAACT 



SEQ ID NO:
33 

30 

32 

31 

30-33 



Isolate 
T8

DK8 

SW3 

DK11 

consensus 



428 TCGCCTAcGCtGCTCGTGTgCCTGAaCTAGtCCTtgAaGTTGTCTTCGGCG 

IIIIIII II lllimi Mill till III I I f 1 1 1 I I I I I 1 I I I I 1 I 1 I I 1 I I 
428 TCGCCTATGCCGCTCGTGTTCCTGAGCTAGcCCTcc^ 

I 1 1 1 1 1 1 1 i 1 1 1 1 1 1 1 1 1 i 1 1 1 1 1 1 1 1 1 1 III I lllllllllllllllllllllll 

428 TtGCCIATGCCGCTCGTGTTCCrGAGCTAGT^ 

I illllilllll IIIIIIIIIIIIIIIMIIIIIMII 1 1 1 1 1 1 1 1 II II 1 1 1 M I 

428 TcGCCTATCCCGCcCXnXnTCCTGAGCrAGTCCTT^ 
TcGCCTAtGCcKSCtCGTGTtCCTGAgCT^ 



SEQ ID NO:
33 

30 

32 

31 

30-33 



Isolate 
T8 

DK8 

SW3 

DK11 

consensus 



489 CGTGGTGTTTGGCriXXSCCTATTTC^ 

i i 1 1 1 1 1 1 1 1 1 1 1 1 1 i i i 1 1 1 1 ii 1 1 i ii 1 1 1 1 iiiiiiiiiiiiiiiMiii hum 

489 CGTGGTGTTTGGCTTGGCCTATTTCTCCATGC^ 

1 1 1 1 1 1 1 1 1 1 1 1 1 f 1 1 1 i 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 i 1 1 1 i i 1 1 1 h 1 1 1 1 1 1 1 

489 CGTGGTGTTTGGCTTGGCCTATTTCTCCATC 

III MMIMI Ml Mill II IIIIIII I Ml I Ml 1 1 1 MM III MM III IIIIIII 

489 tGTGGTGTTTGGCTTGGCCTATTTCTCCATGCAgGGAGCGTGG 
cGTGGTtSTTTGGCTTGGCCTAT^^ 



SEQ ID NO:
33 

30 

32 

31 



Isolate 
T8 

DK8 

SW3 

DK11 



550 CTCCTcCTTGTCGCAGGAGTGGAcGCA 

HIM f 1 1 1 i 1 1 1 1 1 f 1 1 1 1 1 1 Ml 

550 CTCCTtCTTGTCGCAGGAGTGGATGCA 

HIM 1 1 ! I M M I M 1 1 1 1 1 1 1 1 1 1 

550 CTCTTgCTTCTCGCAGGAGTGGATGCA 

HIM Mill 1 1 1 1 1 II 1 1 1 1 1 1 1 ! 

550 CTCCTtCTTGTaGCAGGAGTGGATGCA 



30-33



consensus 



CTCCTtCTTGTcGCAGGAGTGGAtGCA
FIGURE 1E



SEQ ID NO: Isolate
35 DK12 



36 
37 
39 
38 
35-39 



HK10 
S2 
S54 
S52 

consensus 



tTAjGAGTGGCGGAATGTGTCcGGCCTCTAcCT 

II llilll Ml II ItMM llllllil II II 1 1 1 M IIMIIII I illlllllllll 

CTAGAGTGGCGGAATGTGTCTGGCCTCTATGTCCCT^ 

lllllll II MINI IIIIIIIIIMIIIIIII Illlllllllll Illlllllllll 

CTAGAGTCGCGGAATACGTCTGG^ 

iiiiiiiiiiiiiiiiiiiiiiiiiiiin mi mini Miiiiiiiiiiiiiiiii 

CTAGAGTGGCXXSAATACGTCTGGCCTCTATaTCC^ 

1 1 1 1 1 1 ( 1 1 1 M I f 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 llllllil III lllllll I IIMIMIMI 

CTAGAGTGGCGGAATAOGTCTWCCTCTATgTCCT^ 
dTAGAGTCX3CGGAATacG^ 



SEQ ID NO: Isolate
35 DK12



36 
37 
39 
38 
35-39



HK10 

S2 
S54 
S52 

consensus 



62 



62 



62 



62 



62 



lllllll Ilfllllflltllll I I I I f 1 I I I I t t I 1 lllllllllll 
TTGTGTATGAGGCCGATGACGTCATTCTGC^ 

IMIIIIIIIIIIIMIIIIII IMMIMIIII I IMMIIIIIIIII 



iTGTTCAGGA 
lllllllllll 



TcOFIXnATG&GGCCGATGACGTC 

I llllllllllllllllllllllllllllllllllllllll lllllll 
TlXntnATCAGGCCGATGACGTCATTX^ 

mill ii 1 1 ii in iiiiiii 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 iiiiiii I I 

TTGTGTATGAGGCCGATGACGTtATTCTGCACAC^ 



lllllllllll 
TGTGTTCAGGA 

lllllllllll 
TGTGTTCAGGA 

lllllllllll 



TtGTGTATGAGGCCGATCACXSTcATrCTGCACACA^ 



SEQ ID NO: Isolate
35 DK12 



36 
37 
39 
38 
35-39 



HK10 
S2 
S54 
S52 

consensus 



123 GGGGAAXACATCtACGTGCTGK&CCTCaGTGACgCCT 

iiiiiiiiitii iiiiiiiiiiiiii inn 1 1 1 1 1 f r I I I I I t I I 1 I 1 1 1 f 1 1 1 I 1 ! 

123 CGGCAATAC^TCCACGTKKrPGGACCTC 

III lllllllllll III III I II I I IIIIIIIIIIIIII Illlllllllll I III II 
123 CGGtAAIACATCCACGTGCTGGACCCCAGTG 

III I I 1 I I I I f I 1 1 I 1 t I 1 I 1 1 I 1 1 1 I I I I 1 I I 1 I I 1 I IIIIIIIIIIIIII llilll 
123 CGGCAATACATCCACGTGCTGGACCCCAGTXaACAC^ 

IIIIIIIIIIIIII lllllllllll IIIIMIMIIII III MIIIIIIIIIIIIIIIII 
123 CGGC^TACATCCAtGTGCTGGACCTCAGTGACACCTAC^ 

CGGcAATACATCcAcGTGCTGGACCcCaGTGACaCCTJ^ 



SEQ ID NO: Isolate
35 DK12 



36 
37 
39 
38 
35-39 



HK10 

S2 
S54 
S52 

consensus 



184 GCAACCACCGCtTCGATACGCAGTCATGTGGACCTOc^ 

milium 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

184 GCAACCACC^cTCGATAOjCAGTCATGTGGACCTC 

lllllllllll 1 1 1 1 1 1 1 1 1 f 1 1 1 1 1 1 1 1 1 1 1 1 ! II IIIIIIIIIIIIII lllllll 

184 GCAACCACCGCTTCGATACGCAGTCATGT^ 

i i i i i i i i i i i i f i i i i i i i i i i i i i i i m i i i i i i m iiiiiiiiiiiiii mm 

184 GCAACCACCGCITCGATACGCAGTCATXTIX^ 

i iii iii him ii i iiiiiiii i ii ii i i iii i i iii i iii i mm iii ii i mm ii 

184 GCAACCACCGCTTCGATAOGCAGT^^ 

GGAAGCACCGCtTCGATACGCAGTCATGTGGACCT
S52

consensus 



245 CTGCGCTCTACGTGGGtGATgTGTGTGGGGCCGTCTTCCTtGTTC 

1 1 1 1 MilllMIMi Ml IN li Mil III Illli ! I I III INI II I III III Ml 

245 CTGCGCTCTACGTGGG cGATATGTGTGGGGCCGTCTTCCTCGTGGGACAAGCCTrC^ 

Mill llllllllill 1 1 1 1 1 1 1 II I M 1 1 1 1 1 1 1 1 I MUM IIIIIIIMI MIMI 

245 CTGCGCTCTACGTGGGTGATATGTGTGGGGCCCT 

MMIIMM MMIIMMMMMIMM IMIMI MIIIIIIMIIMMMMII 

245 CTGCGCrCTATGTGGGTGATATGTCTGGGGCCGTCT^ 

1 1 1 1 1 1 1 1 ! 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 M 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 M 1 1 ! 

245 CKXXKrrCTATGTGGGTGATATGTGTC^^ 
CTGCGCTCTAcGIX3GGtGATaTGlX31KK3GK 



SEQ ID NO: Isolate
35 DK12 



36 
37 
39 
38
35-39 



HK10
S2 
S54 
S52 

consensus 



306 CAGACCtOSTCGCCATCAAACaGTCCAGACCT^ 

MIMI llllllllllllll MMIMMMIMIMMMMIIMMMIMI III 

306 CAGACCgCGTCGCCATCAAACGGTCCAGACCI^ 



llllll IIIIIMIIIIIIIIIMMMMM 



1 1 1 1 1 1 II II 1 1 1 1 1 ! 1 1 1 1 1 1 1 Ml 



306 C^UjACCTCGTCGCCATCAAACGGTCCAGACCTCT 

MM Ml IMIMMIMMIMMMMMM lllflflflllllllllllllllllll 

306 CAGACCTCX3TCXjCCATCAAACX5GTCCAGACCT 

iiitiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiniiiiiii ii 

306 CAGACCTOGTCGCCAXOUUttXCT^ 

CAGACCtCGTCGCC^TCAAAGgGTCCAGACCIK^AA 



SEQ ID NO: Isolate
35 DK12



36 
37 
39
35-39



HK10 
S2
S52
367 TCAGGACATCGAATGGCTTGGGATATGATGATGAATTG

IMIMIIIMIIIIIIIIIIMIIIIMIIMIIIMIIIMIIM Millllltllll 

367 TCAGGACATCGAATGGCITGGGATATGATGATGAATTG^ 

llllllllill 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 i i i 1 1 1 1 1 ! 1 1 Mlllllllllll 

367 TCAGGACATCGcATGGCITGGGATATGATGATGAA^ 

llllllllill 1 I I 1 1 I 1 I 1 ! I I 1 1 ! 1 1 1 1 1 1 I I 1 I f I 1 1 I 1 1 1 I I 1 1 1 1 I I I i I 1 I f 1 I 
367 TCAGGACATCGAATGGCTTGGGATATGATGATGAATT^ 

1 1 1 1 1 1 1 i 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 i 1 1 1 1 i 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 i 1 1 1 1 i I 

367 TCAGGACATCGAATGGCTTCGGATATGATGATC 

TCAGGACATCGaATGGCTIXSGGATATGATGATGAATTGGTC
37
39 
38 



HK10 
S2 
S54 
S52 



428 TaGCGCACGTCCTGOGtcTCCCCCAGACCTTGTTCGACATA^ 

I llllllllllllll IIIIIIIMIIIIIIIIIIIIIIIIIII llllllllllllll 
TGGCGCACGTCCTGCGgTIXKrCCa^GACCITGT^ 

1 1 1 1 1 1 1 1 1 1 inn iiiiimim i ii n 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 n ii 1 1 

TGGCGCACGTtCTGCQtTIX3CCCC3tf3ACCgTGTTC 

1 1 1 1 1 1 1 1 i inn 1 1 1 1 1 1 1 1 1 1 1 1 mi mill i f 1 1 1 1 1 1 i 1 1 1 1 1 1 1 1 1 

428 TGGCX3CACATCCTGCGATTGCCCCAGACCT1GTTTC 

IMIMI MIIMIMIMIMI MIMMIMM I III III llllll Ml III MIMII 

428 TGGCGGACATCCTGGGAXTGCOrCA^ 



428 



428 



35-39 



consensus 



TgGCGCACgTcCTGCG - tTGCCCCAGACCtTGTTcGACATAaTaGCcKjGGGCCCATTGGGG
36
39


S54 


489 


38 


S52 


489 


35-39 


consensus 





CATCaTGGCg^CCTAGCCTATTACTCCATGCAGGGCAACTGGGC CAAGGTCG CTATCATC 

UN Nil INIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIMII 

CATCTTGGCaGGCCTAGCCTATTACrCCATGCAG 

MnUILUJ N/MlilMMMIIIIIIIIIIIIIINIIIIIIIIIIIIIIillllll 

CATCTOG(K!G6GCCTAGCCZA!CTACTCCATCCAaGGCAACTGGGCCAAG6TCX3CTArCATC 

JLilUJJUJLU'JLLl 1 » 1 1 1 1 1 1 1 ii inn iiiiiiiiiiiiiiiiinmimi 

CATCTIXSGaSGGCCTAGCCTATTATTCTATGCAGGGCARCTGGGC 

i.i4JLUXUJJJJULU' XU m 1 1 1 1 1 1 in 1 1 1 1 ii m 1 1 1 1 1 1 1 1 1 1 1 ii 1 1 1 1 n 

C^TCTTGGCGGGCCTAGCCTATTATTCTO 
CATCtTGKX^a^KSCCTAGCCTATTAcTCcAT^ 



SEQ ID NO: Isolate
35 DK12



36 
37 
39 
38 
35-39 



HK10 
S2 
S54 
S52 
consensus 



550 ATGGTTATGTTTTCAGGaGTCGATGCC 

MIMIMIIIIIIIII lllllllll 

550 ATGGTTATGTTTTCAGGGGTCGATGCC 

1 1 1 1 1 1 1 MMMJL 1 1 i 1 1 1 1 M III 

550 ATGGTTATGTTTTCAGGGGTCGAcGCC

III Mi iMiiii iiiiinii in 

550 ATGATTATGTTTTCAGGGGTCGATGCC 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 H 1 1 1 1 1 

550 ATGATTATGTTTTCAGGGGTCGATGCC 
ATGgTTATGTTTTCAGGgGTCGAtGCC
SEQ ID NO: Isolate
43 Z7 



42 



Z6



42-43 consensus (Z6) 



1 GTcATlCTATCaCAATGCCTCGGGCGTCTATCACATCACCAACGACTC 

11 lllllll IIIIHINIIIIIIIilllll I f 1 1 1 1 1 1 1 1 1 1 1 1 f 1 1 1 1 1 1 1 1 1 f 1 1 

1 GTtlAACTATCGCAATKCTCGGGCGTCTATCACGTC 

GTUUVCTATCgCAATGCCTCGGGCGTCTATCACgTCACC^ 



SEQ ID NO: Isolate
43 Z7



42 



Z6



42-43 consensus (Z6) 



62 TAaTGTATGUUSGCCGAACACOlCATCCrAC^

II IMIIMIIIIIIIIIIM II! IIIIIIIIIIIMIIII I llllllllillll 

62 TAGTGTATGAGGCCGAACACC^gATCTTACACCTCCC^ 
TAgTCTATGAGGCCGAACACCAgATCtTACACCTC 



SEQ ID NO:
43 

42 



Isolate
Z7 

Z6



123 gGGGAACCAGTCACGCTGCTGGGTGGCCCTTACTC cCTTATATCGGT 

UNI I ! 1 1 1 1 1 M 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 i 1 1 M 1 1 I IIIIIIIIMI 

123 tGGGAAtCAGTCAaKrrGCTGGGTGGCCCTTACTC 



42-43 consensus (Z6) 



tGGGAAtCAGTCACGCTGCTGGGTGGCCCTTACTCCCACCGTGGCGGt^ 



SEQ ID NO: Isolate
43 Z7 



42 



Z6 



42-43 consensus (Z6) 

SEQ ID NO: Isolate
43 Z7 



42 



Z6



42-43 consensus (Z6) 



184 GCaCTCCTTGAaTCCaTCCGGAGACATGTGGACCTGATC^ 

ijiMiiiii in 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 m 1 1 1 1 inn inn ii mi 

184 GCTCXX3CTTGAcTCCrt^CGGAGACATCTGGACCT 
GCtCCGCITGAcTCCcTCCGGAGACATGTGGACC^ 

245 CcGCtCTCTACaTIXXXK3ACCTGTGCGa 

I 1 1 Mil M I M I II I M 1 1 1 1 1 1 M I III 1 1 1 1 1 1 1 1 MINIM II || 

245 CtGCCCTCTACg^TGGAGAtCTOTGCGGTG^ 
CtGCCCTCTACgTTGGaGAtCTGTC^ 



SEQ ID NO: Isolate
43 Z7 



42 



Z6 



42-43 consensus (Z6)



306 CCAGCaXXSACGCOOTGG^ 

1 1 1 1 1 i 1 1 1 1 1 1 i 1 1 1 1 i i 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 i I HIM || Mill I 

306 CCAGCOGCGAO GCC ACTGGACTAOGCAGGACroCAATTGCT 
CCAGCCGGaACGCCACTGGACTACGCAGGACTGCAATTGTTC 



SEQ ID NO: Isolate
43 Z7 



42 



Z6



367 ACaGGCCACAGaATGGCATGGGACATGATGATGAACTOGAGTC 

II I 1 t I I I 1 I I I I I M I I I I I I I I I I I I I I I I 1 I II I M I 1 I 1 I i I I I I ( i I M l | 
367 ACgGGCCACAGgATGGCATGGGACATGATGATGAACTGGAGTC 



42-43 consensus (Z6)



ACgGGCCACAGgATGGCATGGGACATGATGATGAACrG
FIGURE 1F



SEQ ID NO: Isolate
43 Z7



42 



Z6 



42-43 consensus (Z6)



428 TCGCCCAGGTtiATGAGGATCCCTAGCACT 

Mllllllll III IIIIIIMIItlMIMMI II llllil IIIIIIIIIIIIIMI 

428 TCGCCCAGGTcATGAGGATCCCTAGCACTCTGCT 

TCGCCCAGGTcATGJlGGATCCCTTUSC^CTCTGGTaG^ 



SEQ ID NO: Isolate
43 Z7 



42 
42-43 



Z6
(Z6)



489 taTCXTTTaTcGGGgrGGCaTACITCtGCATGCAAGCT 

l, JJJJJ. n ,m 1,1,11 ' Miiiiiiiiiiiiiiiiii inn linn 

489 CgTCCnKSTTCGGtTGGCCTACTTCAGtATGCA^ 
cgTCCTTgTtGGGtTGGCgTACTTCaG 



SEQ ID NO: Isolate
43 Z7 



42 
42-43 



Z6
(Z6) 



550 CITITCXrrcTaCGCrGGAGTTGATGCC 

Mllllllll I f 1 1 1 f f 1 1 1 1 1 1 1 1 1 

550 CTTTTCCTCTTCGCTGGAGTTGATGCC 
CTTTTCCTCTtCGCTGGAGTTGATGCC
123
123
123


48 


SA6 


123 


45-50 


consensus 





GTtCCCTACCGgAATGCCTCTGGGGlTTAcCIATGTCACCAATGAcTGCCCAAACTCcTCCA 

JJ'u! LUJ„' ' ' ' ' ' f ' ! ' ' '"in iiiiiiiiiiini iiiiiniiii mi 

IIMMIMIMMIIIMI f 1 1 1 1 1 1 1 1 f 1 1 1 1 1 1 1 1 f 1 1 f 1 1 1 1 J ! I IIIIIHIII 

GTCCCCTACCXaAAATGCCTCcGGGGTTTATCATGTCACCAA 

MINI 1 1 1 1 1 1 1 Mill 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 Til 1 1 1 1 1 1 1 1 

GTTCCCTACCGAAAcGCCTCTGGGOTlTA^ 



OTTCCCTACGGAAATGCCTCTXra 

IMII Mill IIIIIIIIMIIM Mllllll lllllllllllllllllllllllll 

GTTCCtTACXXSgAATGCCTCTGC^^ 

GTt CC CTACCGaAAt GCCTC tGGGGT tTAt CATGTcACOUVTGAtTGCCCaAACTC t TCCA 



iiiMiiiiiiiiiiiii nun mimiimi 1111111111111111 i i 

TAGTCTACGAGGCIGAXAAOCTGATtCroCAOGCACCTGXjra 

IMIMI Mllllll Mllllll 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 f 1 1 1 1 Ml^l 

TAGTCTAtGAGGCn»AcJUlCCTX»TCC^^ 

INI II llllllll lllllllll Mil lllllllllllllllll Mllllll II 

TA6TtT3\OGAGGCTX^TAACX7TG2n > CTTGCAtGCACCrGGTT606TGCCtTGTSTCAGGCA 

i ii iiiiiiiiiiii iiiiiiiiii ii iiiiiiiiiniiiiii inn inn 

TcGTCTACGAGGCTGATGACCTGATCTTACJUaSCACCTGGTTGCGTC 

I Mill IIIMIIIIIIIIIIIII 1 1 f 1 1 1 1 i 1 1 1 1 1 i 1 1 1 1 1 1 1 1 1 1 1 1 1 || | 

TaGTCTAtGAGGCTGA!IX5AGCT6ATCcfrACA 

TaGTcTAcGAGGCTGAtaaCCTGATc - TgCAcGCACCTGG tTGCGTGCCcTGTGTcaggcA 



AGaTAATGTCAGTAGCTGCTGGGTCCAAATCAC^ 

ii >JL X 1 1 1 1 i 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 f 1 1 1 1 1 1 1 1 1 1 1 1 iiiiiiiiiiii i inn 

AGgTAATGTCAGTAGGTGCIXK5GTCCAA^ 

I M M 1 1 M 1 1 M M 1 1 1 1 M 1 1 M 1 1 1 M M II I M 1 1 M M M ! f I! I M M M ! 1 1 

AaATAA3CTCAGTAGGTGCTGGGTCCAAATCACC^ 

i iiiiiiiiiiii 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 mm 

AGATAATGTCAGTAaGTGCTGGGTCCAAATCACCCCCACq 

I NIMH III I IIIIIIIIIIII I f 1 1 1 1 1 1 M I IIIIIIIIIIII 1 1 1 II I 

GGgTAAIXnCACTAGGTCCTGGGTCCAgATCACCCCCA 

II iiiUUiUU IXJJdJJL" MMIIIIMIMI 1 1 1 1 1 1 ! 1 1 ! 1 1 1 ( I ! 1 1 

GGaTAATGTCAGTAGaTGCTGGGTtCAtATCACCCCCACACT 
agaTAATGTCAGTAggTGCTCGGTc£AaATCAC^
SEQ ID NO:


Isolate 


45 


SA1
SEQ ID NO:


Isolate 


45 




47 


SA5
SEQ ID NO:


Isolate 


45 


SA1 


47 


SA5 


49 


SA7 


46 


SA4 


50 


SA13 


48 


SA6 


45-50 


consensus 



184 GCGGTCACGGCTCCTCTTCGGAGGGcCGTTGACTACTT^ 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 in ii 1 1 ii i in ii i iii inn i in ii i 

184 GCGGTCACGGCTCCTCTTCGGAGGGtCGTTGACTACTTAGCGGGAGG 

lllllllllllllllllllllllll llllllllll llllillllllll IMIMIIIM 

184 GCGGTCACGGCTCCTCITCGGAGGGCCGTTGACTACcTJWj 

I IMIimillll Mil IIIIMMIMII MM I I lllllllll I Ml III MINI I 

184 GCXKSTCACGGCaXXTlCTTCGGAGGTC 

MMMIMMMMMMMMIIMMMMIIMMIMM lllllllllll Mil 

184 GCGGTCACGGCTCCTCTTCGGAGGGCCGTTGACTACT 

MMMMMIMMMMMMMMMMI Mill Mill Mill Mill MM 

184 GCGGTCACGGKTTCCTCTTCGGAGGGCC^^ 

GCGGTCACGGCTCCTCTTCGGAGGGcCGTTGAcTACtTaGCG 



245 CaK^VCTATACGTCGGc^ACGa?^^ 

MIMMIMMMM IIMMIIIIMIIMMM 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 i 1 1 

245 CCGCACTATACGTCGGGGACGanXSCGGGGCAG w l^ 

Mil Mill MMMMMMMMMMM IMM lll llllllll 1 1 1 1 1 1 1 III 

CCGCgCTATACGTCGGGGACGCGTGCGGGGCA Gl^ 

Mil 1 1 1 1 1 M I M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 11111111111 1 1 1 1 1 1 1 NiMH III 

CCGCaCTATACGTCGGGGACGCGlKX^GGGGCAGTGTTTT 

mi iiiiiiiiii 1 1 1 1 1 1 1 i 1 1 n 1 1 1 1 i 1 1 1 1 1 1 1 1 1 1 n iiiiiniiiiiii 

CCGOGTTArAOGTCGGAGACXaCGTGKXXKjGCA tf 

1 1 1 1 1 1 1 1 1 1 1 M 1 1 II 1 1 If llllllllll llllilll lllll 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

CaKXHTimkCXrrC^GAGAOGtGTGOGGGGC^ 



245 



245 



245 



245 



CCGC - cTATACGTCGGgGACGcGTGCGGGGCAgTGTrttTGGTATO 



306 TAGGCCTCGCCAGCATACcACaGTGCAGGACIGCRA C 

I 1 I I 1 1 I 1 I i I I I I I t 1 t II I I I I 1 I 1 1 1 I 1 I I 1 1 I I I llllllllll lllllllll 
306 TAGOeXHXXSCCAGCaTACTfta^ 

i 1 1 1 1 1 1 1 1 1 ! 1 1 1 1 MM Ml I IIMIIIIIIIMIMIIIMIIIMI lllllllll 

306 TAGGCCTCGCCAGCACACTACGGTGCAGGACTGCAACTGT^ 

M 1 1 1 1 1 1 1 1 1 i 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 llllllll II I MIMMMMMIMM 

306 TAGGCCTCGCCAGCACACTACGGTGCAaGACTGCAAtTGcTCt^ 

III lllllll III I I lllll llllllll II I 1 I M I I I I I I M 1 I III 
306 TAGcCCTCGCCgGCATAaTgttGTGCAGGACTGCAACTGtTC 

in iiiiiii mi i ii 1 1 1 1 1 m 1 1 1 1 1 1 1 iiiiimiiimii III 

306 TAGgCCTCGCCaGCATgctfacgGTaCAGGACTGCA^ 



TJUSgCCTCGCCaGCAtactacgGTgGAgGACT^ 

367 ACC^GCCACCGgATGGCtTGGGACATGATGATGAA 

lllllllllll lllll 1 I I 1 I I I I I I 1 1 1 1 I I I I ! I II I I f I 1 I I I I 1 I 1 I 1 I 1 1 III 
367 ACCGGCCACCGAATGGCATGGGACATGATGATGAA1TCGT 

lllllll III I II lllllll III I III III IMIIIMIMMIIIMII lllllll nil 

367 ACCGGCCACaSAATGGCATGGGACATGAT^ 

lllllllllll i II I I M I I I I I M I I M I I I I If I M I I I I I I I I I I I mm III 
367 ACCGGCCACCGGATGGCATGGGACATGATGATGAATTGtf 

MIMIMM I IMMIMI MIIMIMM III IMMMMMM II II III III 

367 ACCGGCCACOGGATGGCATGGGACATGATGATGAATTGGT^ 

n 1 1 1 1 1 1 1 i 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 i inn mini 

367 ACtGGCCACCGGATKCATGGGACATGATGATGAATTC 

ACcGGCCACCXjgATGGCaTGGGACATGATGATGAATTGGTC^
FIGURE 1G



TGGCCCTlGaTGCTACGGATcCCCCAgGTGGTC 

1 1 1 1 1 1 1 1 MINIMI! Mill IIIIIIM IIIIIIM 1 1 11 1 II I INI1I II. 

TCGCCCAGgTGCTACGGATTCCCCAaGTGGTCATtGACATC 

MINIM MIMIMIMMIII IIIIIIM IMIIMIMIMMIIMIIMII. 

T^CCAGTTCCTA 

M I M M I M 1 1 1 i 1 1 1 1 M M M M M I i 1 1 M M 1 1 1 M 1 1 1 1 ! M 1 1 M 1 1 1 1 1 1 1 1 . 

TGGCCCAGTTGCTACXSCa^^ 



IMIMMIII 1 1 ! 1 1 M I M M 1 1! M I M I 



428 TCKaaunrctTAasGArra 



lllllll II lllllllliilllllllllll 



llllllllllllllll llllllll 



M II M II M II I M M llllllll
consensus
consensus





SA6 428 TGGCCCAaaTGcTACGGATTCCCCAG^^ 

TTCCCCAgtTGcTACGC^TtCCCCAgGTGCT 



GGTCTIXnTtGCCGcCGCATACTTtGCGTCgGCcGCcAAC^ 

IN Ml Ml MM II II II III Mill II II IMIMIIMIMI llllllll 



II 



GCTCITGTTCGCOGtCGCATACTTCGCGTC 



iiiiiiii!!! nun i n 1 1 1 1 1 n 1 1 1 1 1 1 1 m n 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 



GGTCTTGTTCGCCGCOGCATATTTCGCGTCAGCGGCTJ^ 

iUJMjn f 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 Miiiii 

GGTCTTClTtGCOGCCGCATATTTaXXm^ 

MM III 1 1 I II I III 1 1 11 I MUM J 1 1 f 1 1 1 1 1 1 f f 1 1 HUM I 1 1 1 1 1 1 

GGTCTTSTTCKKXGC^ 

J J ' J ' 1 11 1 11,1111 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 

GGTCTTGTTOGCOGCtGCATACTtOGCGTCGGOGGCTAACTGGGC^ 
GGTCTTGTTcGCCX3ccGCATAcTt cGCGTC -GCgGCtAACTGGGCtAAGGTtg^PgCTCGTc 



CTGTTcCrGTTTGCGGGGGTCGATGGC 

HIM lllllll llllllllllllll 

CTGTTTCTGTTTGCGGGGGTCGATGGC

MIMIIHIIM IIIIMIIIM I 

TTGTTTCTGTTTGCGGGGGTCGATGCC

MIIMIIIIIIIIIIIMIMIMM 

TTGTTTCTGTTTGCGGGGGTCGATGCC 

1 I I I i I I II It I 1 1 1 1 I I I I 1 I I I I I 
cTGTTTCTGTTTGCGGGGGTCGATGCC 

-TGTTtCTGTTTGCGGGGGTcGATGcC
FIGURE 1H
SEQ ID NO:


Genotype


30-33 


(IV/2b)


34 


(2c)


26-29 


(III/2a)


35-39 


(V/3a) 


9-25 


(II/1b)


1-8 


(I/1a)


40 


(4a) 


42-43 


(4c) 


44 


(4d) 


41 


(4b) 


45-50 


(5a) 


51 


(6a) 



1 GTGGAAGTcAGg AACA t CAGTTc tAG cTACTAcG CCACCAATGATTG CTCaAACAaCAGCA 

1 GTGGAGGTCAAGGACACCGGCGACTCCTACATGCCGACCAAC 

1 GcccAAGTGAagAACACCAgtacCaGcTAcATGGTGACcAAcGACTC 

1 cTAGAGTGGOSGAATacGTCtGGCCICTAtgTC 

1 tAtGAaGTGCgCAACGTgTCCXjGGgtgTAccAtGTCACgAAcGACT 

1 tACCAAGTgCGCAACTCcaCgGGgCTtTACCATGTcACCAATGAtTC 

1 GAGCACTACCGGAATGCTTCGGGCATCTATCACATC 

1 GTtAACTATCgCAATGOCTCGGGCGTCT^^ 

1 TACAACTATCGCAACAGCTCGGGTGTCTACCATC 

1 GTGCACT3*£CGGAATGCTTCGGGCGTCTATCATGTCA 

1 GTtrccTACCGaAAtGCCTCtGXKSGTtTAtCATGTcACCAATGAtTC 

1 CTTACCTTICGGCAACTCCAGTGGGCTATAXX^ 



TA 



AC AA GA TG C AA 



62 TCACCTGGCAaCTCACOiACGCAGTtCT 

62 TCGTTTGGCAGCTTGAAGGAGCAGTGCTTCATACT 

62 TGACCTCXXJlaCTccAgGCcG^ 

62 TtGTGTATGAGGCQ3ATGACGTcATTCTCCACACACCt 

62 TtGTGTatGAggCAgcgGACaTGAT^TCOVcACcCC 

62 TtGTClACGAGgCgGCcGATgCcATcCTgCAca 

62 TAGTCTATGAAGCTGACCATCACATCCTACACTTGCCGGGG 

62 TAgTGTATCAGGCOGAACACCAgATCtTACAC^ 

62 TAGTCTATGAAACCGATCACCACATCTTACACCTC 

62 TAGTGTACGAGAGGGAGCACCAO*TCA!IGCACTI^ 

62 TaGTcTAcGAGGCTGAtaaCCTGATctTgCActfCACCTTC 

62 TCX3TGCIX 5G ftGGCGGATGCTftIG Al^ 



T T CA 



CC GG TG T CC TG G 



123 cAATGGCACCcTGCgCTGCTGGATACAAGTgACACCTA^ 
123 CGCX3JtCGTCTCT05ATGT^^ 

123 gGGAAAtaCaTCtCGgTGCTGGATACCGGTctCaCCAJUlcGTg^ 
123 CGGcAATACATCc^lcGTGCTGGACCcCaGTGACaCCTACaG^ 
123 gaacAActcCTCccgcTGcTGGGTaGCGCTcaCtCCCACgCTcGC 
123 GGgTaaCgcctCGAggTGTTGGGTGgCGgTGaCCXXCACgGT^^ 
123 TGGGAACTVJCATCGanTGCTGGACGCCG^ 

123 tGGGAAtCAGTCACGCIGCTGGGTGGCCCTTACTCCCACCGTG^ 

123 AGGGAACAAGTCTACATGCTGGGTGTCTCTCACTC 

123 GGAGAATACTTCTOGCTGCTGGGTGCCCXT^ 

123 agaTAATCTCAGTAggTGCTGGGTcCAaATCACCCCCAC^ 

123 CGATCATCGGTCCACCTGTTGGCATCCTGTGA 



1-51 



TG TGG 



T C CC A T C
SEQ ID NO:


Genotype




30 


-33 


(IV/2b) 


184 GCaCTcACTCAcAACCTG CGAaCaCAt gTcGAcaTGATcGTAATGGCAGCTACGGTCTGCT 




34 


(2c) 


184 GCTCTCACTAAGGGCCTGCGAGCACACATCGAT^^ 


26 


-29 


(III/2a) 


184 GCcCTcACGCWjGGCTTGCGGACgCACATcGACAT^ 


35 


-39 


(V/3a) 


184 GCAACCACOGCtTCGATACGCAGTrCATGTGGACCTatTaGT^ 


9 


-25 


(II/1b)


184 gTCcCcACtAcGaCaATACGACgcCAcGTCGAtTTGCTCGTT^ 


1-8 


(I/1a)


184 CTCCCc^a^GGCAgCrtaSACXSTcACATCGA 




40 


(4a) 


184 GCTCCGCTTGAGTCGTTCCGGCXjACATGTGC^ 


42-43 


(4c) 


184 GCtCCGCTTGACTCCcTCCGGAGACyiTGTGG 




44 


(4d) 


184 GCTCCGCTTGAGTCTTTGJU^CGTCACGT^ 




41 


(4b) 


184 GCACCXnTAGJUn , CCATGCGCAGGCAT^ 


45 


-50 


(5a) 


184 GCXMTCAOGGCrCCTCITCGGAGGGcCGTTGA 




51 


(6a) 


184 Aa3CCCGCAACGGGATrCCGatfK3CATGTGG^ 



1-51 


consensus 


TG T GA TG GC TTGT 


SEQ ID NO:


Genotype




30-33 


(IV/2b) 


245 CXKSCCTTGTATGTGGGaGACgTgTGCGGGGCCGTGA 


34 


(2c)


245 CTGCCCTTTATGTGGGGGAC^aTGTGTC 


26-29


(III/2a) 


245 CcrGCtCTtTACGTGGGGGAccTCTGCGGcGGGgTgA^ 


35-39 


(V/3a) 


245 CTXS^CTICTAcGTGGGtGATaTGTGTGGGGC^ 


9-25 


(II/1b)


245 COGcUlTGTAcGIGGGgC^tCTcnXKXKSaTCt 


1-8 


(I/1a)


245 CGGCCCTCTAcGTGGGGGACtTGTGCGGGTCTGTCITtCT 


40 


(4a) 


245 CTGCCCTCTA!IXnTGGGGACCTCIt3CGGAGGTGCC^ 


42-43 


(4c)


245 CtGCCCTCTACgTTGGaGAtCTTnGCGGTGGtGcATTCCT 


44 


(4d) 


245 CC&CCCTCTACATCGGJUaAGGTGTGTGGGGGT^ 


41 


(4b) 


245 CaXXrrTCTACATTGGAGATCTGIXnGGA 


45-50 


(5a) 


245 CCGCgc^ATACGTCGGgGACGcGTGCXXKXSCAgTGTI^ 


51 


(6a) 


245 CATCCCTGTACATCGGGGACCTGTGTGGCT 


1-51 


consensus 


C TTA TGGGA TG GG TT CA T 


SEQ ID NO:


Genotype




30-33 


(IV/2b) 


306 ATCGCCaGAACgCCACaACITtACCCAaGAGTGCAACTGTTC 


34 


(2c)


306 GTajCCACAACACCATAC^lVlX^lCCAGG 


26-29 


(III/2a) 


306 CTCGCCXSCaaCacOlCTgGTTTGTGCAaGAaTGC^ 


35-39 


(V/3a) 


306 CAGACCtQ^TCGCCATCAAACgGTCCAGACCTGTAACT 


9-25 


(II/1b)


306 CTCgCCtCX3c£ggcAtgaGACagt:aCAGgAcTCcAAcTC 


1-8 


(I/1a)


306 crrctCCX^GgCgCCaCrGGACaACGOlaGa 


40 


(4a) 


306 TCGGCCGCGTCGCCACTGGACCAOGCAGGAGrrGCAATT^ 


42-43 


(4c) 


306 CCAGCCGCGAaX:CACTGGACTACGCAGGACT^ 


44 


(4d) 


306 CCAACCTCGCCGCCACTGGACCACCCAAGACTGC^ 


41 


(4b) 


306 OOSACCXKXXXaK^CrGGACC^ 


45-50 


(5a) 


306 TAGgCCTCGCCaGCAtactacgGTgCAgGACTGCAAcTGtTCcATTO 


51 


(6a) 


306 TC^aXCGCTGTCATTGGAClGT^ 


1-51 


consensus 


CC C CA TG AA TG TC TTA GG T 


SEQ ID NO:


Genotype




30-33 


(IV/2b) 


367 ACCGGCGACCGCATGGCaTG&GACATGATGCT 


34 


(2c) 


367 ACGGGACACCX5CATGGGTTGGGATA1GATGATGAACTGG 


26-29 


(III/2a)


367 ACtGGaCACCGTATGGCATGGGAcATGATGATGAACTOGTC^ 


35-39 


(V/3a) 


367 TCAGGACATCXSaATGGCZTGGGATAT^ 


9-25 


(II/1b)


367 tCAGGTCAcCGcATGGCtTGGGAtATGATGATGAAcTGGTC 


1-8 


(I/1a)


367 ACGGGtCAcCGcATGGCaTGGGATATGATGATGAACTGGTCCCCtACgaCg 


40 


(4a) 


367 ACCGGCCACAGGATXjGCGTGGGACATGATGATGAACTGGAGC^ 


42-43 


(4c) 


367 ACgGGCCACAGgATGGCATGGGACATO^TGATGAACT 


44 


(4d) 


367 ACAGGACACAGAATGGCTTGGGACATGATGATGAATTGGAGCC 


41 


(4b) 


367 TCGGGCCACAGGATGGCCTGGGACATGATGATGAACTGGAG 


45-50 


(5a) 


367 ACcKK^CACCGgATGGCaTGGGACATGATGA 


51 


(6a) 


367 ACCGGCCACAGGATGGCITGGGACATGATG^ 


1-51 


consensus 


C GG CA G ATGGC TGGGA ATGATG T AA TGG CC C T T
FIGURE 1H

SEQ ID NO: Genotype



30-33 (IV/2b) 428 TcGCCTAtGCcGCtCGTGTtCCTGAgCTAGtC 

34 (2c) 428 TCGCGTACTIGGrGOX^CCCGGAAG^ 

26-29 (III/2a) 428 TGGCGTACGcGATGCGCGTTCCCGAGGTCATCaTA 

35-39 (V/3a) 428 TgGCGOlCgTcCrGCGttTCCCCCAGACCtTGTTcKSACAT

9-25 (II/1b) 428 TaTOSCAgtTaCrcCGgaTCCC^CAAGCrrgrrc

1-8 (I/1a) 428 TaGCtCAGCTGCTCcGGaTCCCgCAaGCCaTCTTGGAcATC

40 (4a) 428 TCGCCCAGATCATGAGGGTCCCCACAGCCTTTCT

42-43 (4c) 428 TCGCCCAGGTcATGAGGATCCCTAGCACTCra 

44 (4d) 428 TCGCCCAACTTATGAGGATCCCAGGCGCCATGGTCGACCT^ 

41 (4b) 428 TGGCTCAGATCTTTACGGATrcCCTOT 

45-50 (5a) 428 TGGCCXAgtTGcTAaMAXtCCCCAgGTGGI^ 

51 (6a) 428 TATCTAGCATCTTGAGGGTACCTGAGATTTGTG 

1-51 consensus TC GTCC TTGGGCA TGGGG 



SEQ ID NO:

Genotype








30-33 


(IV/2b) 


489 


CGTGGTGTTTGGCTTWCCTATTTCT 


34 


(2c) 


489 


TGTAATGTTTGGCCTCGCTTACITCT^ 


26-29 


(III/2a) 


489 


CGTCaTGTTcGGCtTaGCCTACrrCTCTATGC^ 


35-39 


(V/3a) 


489 


CAlt^tGGCgGGCCTAGCCTATCAc^ 


9-25 


(II/1b)


489 


agTCCTgGa3GGCCTtGCc?rACTAtTC^ 


1-8 


(I/1a)


489 AGTCCTaGCGGGCATAGCGTATTTcTCCATGG tGGGgAACTGGGCGAAGGTCcTggTaGTg 


40 


(4a) 


489 


CGTCCTCGCGGGCTTGGCGTACTTCAGCATGCAAGGCAAT^ 


42-43 


(4c)


489 


cgTCCTTgTtGGGtTGGCgTACTTCaGtATC 


44 


(4d) 


489 


GA!l^LTGtfLTG(&AXAGCCT^ 


41 


(4b) 


489 


Abn'ITJlUXaCTGGTC^^ 


45-50 


(5a) 


489 


GGTCTTGTTcGCCGccGCATAcTtc*XX^ 


51 


(6a) 


489 


GATACTACTAGCOnTGCCTACTTTGGC^ 


1-51 


consensus 




T T G GC T T TGG AA GT T 


SEQ ID NO: Genotype

30-33 (IV/2b) 550 CTCCTtCTTGTcGCAGGAGTGGAtGCA

34 (2c) 550 CTCCTGCTGACTGCTGGGGTGGAGGCG

26-29 (III/2a) 550 CTttTGCTggCcGCTGGgGTGGACGCG

35-39 (V/3a) 550 ATGgTTATGrrrrrTCAGGgGTCGAtGCC

9-25 (II/1b) 550 aTGCTACTcTTTGCcGGcGTtGAcGGg

1-8 (I/1a) 550 CTGtTGCTgTTtgCCGGCGTcGAtGCG

40 (4a) 550 CTTTTCCTCTTTGCTGGGGTAGACGCC

42-43 (4c) 550 CTTTTCCTCTtCGCTGGAGTTGATGCC

44 (4d) 550 CTGTTTCTCTTTGCTGGAGTCGACGCT

41 (4b) 550 CTATTCCTCTTTGCCGGGGTCGAGGGA

45-50 (5a) 550 tTGTI^CTXSTTTGCGGGGGTcGATGcC

51 (6a) 550 CTGTTCCTATTTGCAGGGGTTGAAGCA

1-51 consensus T T T C GG GT GA G
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FIGURE 2A 



SEQ ID NO Isolate

56 S14 1 YQVRNSTGLYHVTNDCPNSSIVYEtADAILHaPGCVPCVREGNtSRCWVAMTPTVATRDGK

52 DK7 1 YQVRNSTGLYHVTNDCPNSSIVYEAADAILHTPGCVPCVREGNvSRCWVAMTPTVATRDGK

59 US11 1 YQ^WSTGLYHVTinDCPNSSrvnfEAADAILin^GCTOCTO

55 DR4 1 HQVRNSTGLYHVTNDCPNS S rWEAADAILHTPGCVPCVMGNt SRCWVAVTPTVATRDGK

54 DR1 1 HQVRNSTGLYHVTITOCPNSS IVYKAADAILHaPGCVPCV^

53 DK9 1 YQVRNSSGLYHVTTTOCPNSSI\rYKAM

58 SW1 1 YQVRNSSGLYHVTNDCPKSSIVYETiU3AII£SPGC^

57 S18 1 YQVRNStGLYHVTOT)CPNSSrVYETiU3tIIiHSPGCVPCTO

52-59 consensus yQVRNStGLYHVTNDCPNSSIVYEaADalLH-PGCVTCV^



SEQ ID NO Isolate

56 S14 62 LPatQLRRylDIiVGSATLCSJU^YVG

52 DK7 62 LPTaQLRKHIDIiLVGSATI£SALYVGDL^

59 US11 62 LPTTQLRRHIDLLVGSATLCSALYVGDUX3SVra^

55 DR4 62 LPTTQXJIBH^

54 DR1 62 LPTl^LR^IDLLVGSATLCSJUiYVGDLra

53 DK9 62 LP ATQLRR^Dzlw

58 SW1 62 LPATQLREHIDLLVGSATLCSALYVGDLCGSVFLVSQL^

57 S18 62 LPATQIJlRHIDIjLVGSATLCSi^YVGDLCGSVFLVSQLFTi S PRRHWTTQDCNCSrYPGHI

52-59 consensus LP -tQLRRhlDLLVGSATLCSALYVGDLCGSVFLVg^



SEQ ID NO Isolate

56 S14 123 TGHRMAWDMMMNWSPTTALVVAQLLRIPQAIMMI^^

52 DK7 123 TGHRMAWDMMMNWS PTTALWAQLLRI PQAILDMIAGAHWGVLAGIAYFSMVGNWAKVLVV

59 US11 123 TGHRMAWDMMMNWSPTaALVVAQLLRI PQAILDMIAGAHWGVLAGIAYFSMVGNWAK37LVV

55 DR4 123 TGHRMAWDMMMNWSPTTALVVAQLLRIPQAILDMIAGAH^

54 DR1 123 TGHRMAWDMMMNWSPTTALVMAQLLRI PQAILDMIAGAHWGVIiAGIAYFSMVGNWAKVVVV

53 DK9 123 TGHRMAWDMMMNWSPTaALVMAQLLRI PQAIII2MIAGAHWGVLAGIAYFSMVGNWAKVVW

58 SW1 123 TCaO^WDMMMNWSPTTALVv^

57 S18 123 TGHRMAWDMMMNWS PTTALViAQLI^vPQAVIiDMIAGAHWGVIiAG IAYFSMaGNWAKVLl V

52-59 consensus TGHRMAWDMMMNWS PTtJlLVvAQIiRiPQAiLDMXAGA^



WO 95/01442 PCT/US94/07320

31/47 



FIGURE 2A 



SEQ ID NO: Isolate

56 S14 184 LLLFAGVDA

52 DK7 184 LLLFAGVDA

59 US11 184 LLLFAGVDA

55 DR4 184 LLLFAGVDA

54 DR1 184 LLLFAGVDA

53 DK9 184 LLLFtGVDA

58 SW1 184 LLLFsGVDA

57 S18 184 LLLFaGVDA

52-59 consensus LLLFaGVDA
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FIGURE 2B 





JL.SOJ.Buc 
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62 
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HK4 
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67 


IND5 
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73 
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66 


HK6 


1 


61 
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74 


T3 


1 


65 


HK5 


l 


71 


S45 


l 


72 


SA10 


1 


69 


P10 


1 


60 


Dl 


1 


70 


S9 


1 


60-76 


consensus 





II I 



II MM 



rSSRCWVALTFTLAARMAS 



IIIIIIIIIIIIIIIIIIIIMI 

EVRNVSGVYHVTNDCSNSSIVYE 

Ml 

rVTE 

Ml 

YHVTNDCSNSS IVYE 
I 



TPGCVPCVRENNSSRCWV3VLTPTLAAKNAS 

l I 1 1 1 1 1 1 1 1 f 1 1 1 1 1 f 

riTGCVPCVRECfNf SsCWVMjTPTIAARNAS 

l 

TTPGCVPCVREGNS SRCWVMjTPTIAARNAS 

him i 



Mill I 

iTPGCVPCVRENNSSRCWVALTPTlAARNVS 

lllllllll 

rrPGCBiPCVRBRNSSRCWVRLTPTTAARNVS 

Mill f f 1 1 1 1 1 1 i 1 1 1 1 1 1 1 1 I 

XTPGCVPCVRBdNSSRCWVAI/TPTLAARNsS 

i 



lllllll 



3TPGCVPCVRE sNS SRCWVAI/TPTLAARNAS 



Mill 



lllllll! 



i i 

Dvl 
I I 



TITGCVPCVREKNSSRCWV3U*aPTIJlARNAS 

lllllll I 



flTPGCVFCVRENNSSRCWVALTPTlAARNSS 



I III 



yEVrNVSGVYhVTlHDCSNsS iVyEaaDmlm^ 



* 
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SEQ ID NO Isolate

75 T10 62 vPTTTIRRHVDLLVGAAAFCSAMYVGDLCGSVFLVSQLFTFS PRRHET1QDCNCS IYPGH1

62 DK1 62 IPTTTIRraVDIiVGAAAFCSAMrVGDIiOTSWLVSQIiF^

64 HK4 62 I PTTTIRRHVDLLVGAAAFCSAMXVGDLCGSVFLVS qlftfs prrhetvqdcncs iypghv

76 US6 62 VPTTTIIUlHVDLLVGAAtFCST^MWGDLCGSVFLiSQLFTFS PRqHETVQDCNCS IYPGHV

68 IND8 62 VPTTTIRRHVDLLVGAAAFC&AMYVGDLCGSVFLVSQLFTFS PRRHETVQDCNCS IYPGHV

67 IND5 62 VsTTTIRhHTO^

73 SW2 62 VPTTTIRRHVI)i^ IYPGHV

63 HK3 62 VPTTTIiyUiVDIiLVGAAAFCSAMYVGDLCGSVFLVSQLFTFS PRRHBTVQDCNCS IYPGHV

66 HK8 62 VPTTTIRRHVDLLVGAAAFCSAMYVGDLCGSVFLVSQLFTFSPR

61 D3 62 VPTTTIRRHVDLLVGAAAFCSAMYVGDLCGSVFLV

74 T3 62 VPTk^RRtfl/DLLVGAAA

65 HK5 62 VPTTailRRHVDIjLVGAAAFCSAMYVGDLCGSVFLVSQLFTFS PRRHETVQDCNCS IYPGHV

71 S45 62 VPTTTI RRHVDLLVGAAAFCSAMYVGDLCGSVFLVS QLFTFS PRRHETVQDCNCS IYPGHV

72 SA10 62 VPTTTIRRHVDLLVGJUU^CSAMYVGDLCGSVFLTC

69 P10 62 VPTTAIRRHVDLLVGAAAFCSAMYVGDL03SV1LVSQLFTO

60 D1 62 VPTTAIRRHVDLLVGAAAFCSAMYVGDLCGSVFLISQLFT^

70 S9 62 VPTTtlRRHVDLLVGAAvFC^WIXVGDLCXSSVFLISQLFTiS

60-76 consensus vpTttlRrHVDIiVGAAaFCSaMYVGDLCGSVf LvSQLFTf SPRrheTvQd
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SEQ ID NO Isolate

75 T10
62 DK1
64 HK4
76 US6
68 IND8
67 IND5
73 SW2
63 HK3
66 HK8
61 D3
74 T3
65 HK5
71 S45
72 SA10
69 P10
60 D1
70 S9
60-76 consensus



123 SGHF^WDMMI^ 
123 S GHRMAWDMMMNWS PTTALV1 SQLLRI PQAV^MVAGAHWGVIJ^IAYySMAGNWAKVLIV 

_ ill in i ii i ii iii i iii inn mi linn i it in linn i mini.. 

123 SGHRMAWDMMMNWS PTAALWSQLLRl P^VMDMVAGJUIWGVLAGLAYYSMVGNWAKVLIV 



123 SGHRMAWDMMMNWSPTAALWSQLIJlIPQAVMDMVaGAHWGV^ 

„ 1 I 1 f I I I I I 1 1 I I i I I I ( I I M I i I I I 1 I 1 I 1 I I I I I f I f lllllilll IIIIIMI. , 
123 SGHRMAWDMMMNWS PTAALWSQLLRl PQAWDMVAGAHWG I LAG LAYYSMVGNWAKVUV 



123 SGHRMAWDMMMNWSPTAALWSQLLRIPQAWDMVAG&HWGILAGIA^ 
123 SGHRMAWDMMMNWSPTAALWSQLLRIPQAVTO 



123 SGHRMAWDMMMNWSPXAALWSQLLRIPQAVVDMVJAGA^^ 

1 1 II 1 1 1 1 1 1 1 M 1 1 1 lllllilllllll 

123 SGHRMAWDMMMNWSPTtJtfjWSQIJjRIPQAiVDMWVGAHWGV^ 

1 1 1 1 1 1 M I i 1 1 f 1 1 

123 TGHRMAWDMMMNWSPTaALVVSQLLMPQAVVDMVAGAH^ 



123 TGHRMAWDMMMNWS PTTALWSQLLRIPQAVVDMVAGAHWGVLAGLAYY^ 

123 TGHRMAWDMMMNWS PTTALWSQLLRI PQAVVDMVAGAHWGVIJUSIAYYSMVGNWAKVIiIV 

123 TCHRMAWoiwiJnlsPT 
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FIGURE 2C 
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FIGURE 2E 
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FIGURE 2H 
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77-80 (III/2a) 1 aq(VkNTst6YMVTNDCSNdSITV^Ix^VI^

86-90 (V/3a) 1 LEWRNtSGIiYvLTNDCeNS S I VYKADDVILHTPGCVPCVQDGOTS tCWTpVTPTVAVRYVG 

60-76 (II/1b) 1 yEVrNVSGvYhVTNDCSNs S i VyEaaDmlmHTPGCvPCVrBnNs S rCWVAL tPTLAARNa 6

52-59 (I/1a) 1 yQVRNStGIiYHVTNDCPNSSIVYEaADalLHsPGCVPCVRBgra

91 (4a) 1 KHYRNASGIYHITNDCPNSSIVyRADHHIIiHI*PGCVPCVKIX3OT

93-94 (4c) 1 VNYrNASGVYHvTNDCPNS S I vYKAKHqILHLPGCl PCVRvGNQSRCWV3UiTPTVAveYIG

95 (4d) 1 YNYRNSSGVYHVTNDCPNSSrtnfETDYHILHLPGCVPCVRE^ 

92 (4b) 1 VHYRNASGVYHVTNDCPNTS rvnfKTEHHIMHLPGCWCVRTENTSRCWVPLTPTVAAP YPN 

96-101 (5a) 1 VPYRNASGVYHVTNDCPNSSrVYEJffinLILHAPGCVPCVrqdWVS 
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60-76 (II/1b) 62 vpTt t IRrHVDLLVGAAaFCS aMYVGDI/ZGSVf LvSQLFTf S PRrheTvQdCNCSi Y^Ghv

52-59 (I/1a) 62 LPatQUUtolDIiLVGSATLCSALYVGDLTO

91 (4a) 62 APLESFRRHVDLMVG/tflTLCSJ^YVGDLCGG 

93-94 (4c) 62 APLdS IRRHVDIWVGAATVCSJ^YvGDLCGGaFLVGQMPS FQPRRHWTTQDCNCSIYAGHi 

95 (4d) 62 API^SLRRHVDLMVGGATLCSALYIGDVCGGVFLV^ 

92 (4b) 62 APLESMRRHVDI2*VGAATMCSAFYIGDLCGGVFLVGQ 
96-101 (5a) 62 AWAPLRRaVDYLAGGAALCSALYVGDaCGAvFIM 

102 (6a) 62 TPATGPRRHVDIJAGAAVVCSSLYIGDIiCGSLFIAGQ 
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91 (4a) 123 TGHRMAWDMMMNWSPTTTLLLAQIM^ 

93-94 (4c) 123 TGHRMAKDMMMNWSPTTTLl LAQVURJ PSTLVlJLIjaGGHWGvLvGlAYFsMQANWAKVILV 

95 (4d) 123 TGHRMAWDMMMNWSPTATI#VXAQIJMRIPGAMVDLIjAGG 

92 (4b) 123 SGHRMATOMMMNWSPTSALIMAQIIjRIPSILGDIXTGGHWGVIAG 
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102 (6a) 123 TGHRMAWDMMMNWSPTTTLVliSSIIiRVPBICASVIFGGHWGI 

52-102 consensus GHRHAKDMM NWSP R p G HWG A W KV
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